

A COURSE OF 


Higher Mathematics 


VOLUME III
PART ONE 


V. I. SMIRNOV 


Translated by 
D. E. BROWN 


Translation edited by 


I. N. SNEDDON 


Simson Professor in Mathematics 
University of Glasgow 


PERGAMON PRESS 


OXFORD-LONDON-EDINBURGH-NEW YORK 
PARIS-FRANKFURT 


1964 


CONTENTS 


INTRODUCTION 
PREFACE TO THE FOURTH RUSSIAN EDITION


CHAPTER I 


DETERMINANTS. THE SOLUTION OF SYSTEMS OF EQUATIONS 


§ 1. Properties of determinants 
1. Determinants. 2. Permutations. 3. Fundamental properties of 
determinants. 4. Evaluation of determinants. 5. Examples. 6. Multipli- 
cation of determinants. 7. Rectangular arrays. 


§ 2. The solution of systems of equations 

8. Cramer’s theorem. 9. The general case of systems of equations.
10. Homogeneous systems. 11. Linear forms. 12. n-dimensional vector
space. 13. Scalar product. 14. Geometrical interpretation of homogeneous
systems. 15. Non-homogeneous systems. 16. Gram’s determinant.
Hadamard’s inequality. 17. Systems of linear differential equations
with constant coefficients. 18. Functional determinants. 19. Implicit
functions.


CHAPTER II 


LINEAR TRANSFORMATIONS AND QUADRATIC FORMS 


20. Coordinate transformations in three-dimensional space. 21. General
linear transformations of real three-dimensional space. 22. Covariant
and contravariant affine vectors. 23. Tensors. 24. Examples of affine
orthogonal tensors. 25. The case of n-dimensional complex space. 26.
Basic matrix calculus. 27. Characteristic roots of matrices and reduction
to canonical form. 28. Unitary and orthogonal transformations.
29. Buniakowski’s inequality. 30. Properties of scalar products and
norms. 31. Orthogonalization of vectors. 32. Transformation of a quadratic
form to a sum of squares. 33. The case of multiple roots of the
characteristic equation. 34. Examples. 35. Classification of quadratic
forms. 36. Jacobi’s formula. 37. The simultaneous reduction of two quadratic
forms to sums of squares. 38. Small vibrations. 39. Extremal
properties of the eigenvalues of quadratic forms. 40. Hermitian
matrices and Hermitian forms. 41. Commutative Hermitian matrices.
42. The reduction of unitary matrices to the diagonal form. 43. Projection
matrices. 44. Functions of matrices. 45. Infinite-dimensional
space. 46. The convergence of vectors. 47. Complete systems of
mutually orthogonal vectors. 48. Linear transformations with an
infinite set of variables. 49. Functional space. 50. The connection
between functional and Hilbert space. 51. Linear functional operators.


CHAPTER III 


THE BASIC THEORY OF GROUPS AND LINEAR REPRESENTATIONS 


OF GROUPS 


52. Groups of linear transformations. 53. Groups of regular polyhedra.
54. Lorentz transformations. 55. Permutations. 56. Abstract groups.
57. Subgroups. 58. Classes and normal subgroups. 59. Examples.
60. Isomorphic and homomorphic groups. 61. Examples. 62. Stereographic
projections. 63. Unitary groups and groups of rotations.
64. The general linear group and the Lorentz group. 65. Representation
of a group by linear transformations. 66. Basic theorems.
67. Abelian groups and representations of the first degree. 68. Linear
representations of the unitary group in two variables. 69. Linear
representations of the rotation group. 70. The theorem on the simplicity of
the rotation group. 71. Laplace’s equation and linear representations of
the rotation group. 72. Direct matrix products. 73. The composition of
two linear representations of a group. 74. The direct product of groups
and its linear representations. 75. Decomposition of the composition
$D_j \times D_{j'}$ of linear representations of the rotation group. 76. Orthogonality.
77. Characters. 78. Regular representations of groups.
79. Examples of representations of finite groups. 80. Representations
of a linear group in two variables. 81. Theorem on the simplicity of
the Lorentz group. 82. Continuous groups. Structural constants.
83. Infinitesimal transformations. 84. Rotation groups. 85. Infinitesimal
transformations and representations of the rotation group. 86. Representations
of the Lorentz group. 87. Auxiliary formulae. 88. The formation
of groups with given structural constants. 89. Integration over
groups. 90. Orthogonality. Examples.


INDEX
VOLUMES PUBLISHED IN THIS SERIES


INTRODUCTION 


A BRIEF account of the history of this five-volume course of higher 
mathematics has been given in the Introduction to Vol. I of the 
present English edition. This volume and the subsequent ones were, 
from the first Russian edition (1933), entirely the responsibility of 
Professor Smirnov. 

In most texts on the methods of mathematical physics algebraic 
methods play a minor role compared with methods based on the 
theory of functions. This is not so in Professor Smirnov’s scheme. In 
this first part of Vol. III a full account is given of the two branches
of modern algebra — linear algebra and the theory of groups — which 
are most frequently used in theoretical physics. There is a detailed 
treatment of the theory of determinants and matrices and of quadra- 
tic forms including all the results necessary for an understanding of 
the concepts of functional and Hilbert space. The second part is devoted 
to a full account of the basic theory of groups and of the linear repre- 
sentations of groups. Novel, in a first course on algebra, is the inclusion 
of the elements of the theory of continuous groups. 

This volume is quite obviously of interest to applied mathemati- 
cians and theoretical physicists but its claims as providing material 
for a first course in abstract algebra for students of pure mathematics 
should not be disregarded. 

I. N. SNEDDON 




PREFACE TO THE FOURTH RUSSIAN EDITION 


In the present edition the third volume has been divided into two 
parts in connection with the addition of new material. The first part 
contains all material referring to linear algebra, to the theory of 
quadratic forms, and to the theory of groups. I was greatly assisted 
in compiling the additional material by D. K. Faddeyev. He was 
partly responsible for the clarification of the simplicity of rotation 
and Lorentz groups, for the presentation of the material referring 
to the formation of groups with given structural constants and to 
integration over groups [70, 81, 87, 88, 89, 90]. I am very grateful 
to him for his assistance. 

V. SMIRNOV


CHAPTER I


DETERMINANTS. THE SOLUTION OF SYSTEMS 
OF EQUATIONS 


§ 1. Properties of determinants 


1. Determinants. We shall start the present section by taking the
simple algebraic problem of the solution of a system of first degree
equations. This will lead us to the important concept of determinant.

We first consider some simple, particular cases. A system of two 
equations with two unknowns may be written: 


$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 &= b_1,\\
a_{21}x_1 + a_{22}x_2 &= b_2.
\end{aligned}
$$

The coefficients $a_{ik}$ of the unknowns are distinguished by two
subscripts, the first indicating the equation in which the coefficient 


occurs, and the second showing with which unknown it is associated. 
We know that the solution of the system is 


$$
x_1 = \frac{b_1 a_{22} - a_{12} b_2}{a_{11}a_{22} - a_{12}a_{21}}, \qquad
x_2 = \frac{a_{11} b_2 - b_1 a_{21}}{a_{11}a_{22} - a_{12}a_{21}}.
$$

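The closed formula for two unknowns can be checked numerically. The following is only a sketch, assuming Python and exact rational arithmetic; the function name `solve_2x2` and the sample coefficients are illustrative and do not come from the text.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x1 + a12*x2 = b1, a21*x1 + a22*x2 = b2 by the closed formula."""
    denominator = a11 * a22 - a12 * a21
    if denominator == 0:
        raise ValueError("the denominator a11*a22 - a12*a21 vanishes")
    x1 = Fraction(b1 * a22 - a12 * b2, denominator)
    x2 = Fraction(a11 * b2 - b1 * a21, denominator)
    return x1, x2

# Hypothetical system: 2*x1 + x2 = 5, x1 + 3*x2 = 10.
x1, x2 = solve_2x2(2, 1, 1, 3, 5, 10)

# Substitute back into both equations to confirm the solution.
residuals = (2 * x1 + x2 - 5, x1 + 3 * x2 - 10)
```

Both residuals vanish for this system, as they must whenever the denominator is non-zero.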
We next take three equations with three unknowns: 
$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1,\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2,\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= b_3,
\end{aligned}
$$


the same notation as above being used for the coefficients. We rearrange
the first two equations as

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 &= b_1 - a_{13}x_3,\\
a_{21}x_1 + a_{22}x_2 &= b_2 - a_{23}x_3.
\end{aligned}
$$

We solve these with respect to the unknowns $x_1$ and $x_2$ in accordance
with the previous formula:

$$
x_1 = \frac{(b_1 - a_{13}x_3)\,a_{22} - a_{12}\,(b_2 - a_{23}x_3)}{a_{11}a_{22} - a_{12}a_{21}}, \qquad
x_2 = \frac{a_{11}\,(b_2 - a_{23}x_3) - (b_1 - a_{13}x_3)\,a_{21}}{a_{11}a_{22} - a_{12}a_{21}}.
$$


On substituting these expressions in the last equation, we obtain
an equation for the unknown $x_3$; solution of this latter equation leads
us to the final expression for this unknown as

$$
x_3 = \frac{a_{11}a_{22}b_3 + a_{12}b_2a_{31} + b_1a_{21}a_{32} - a_{11}b_2a_{32} - a_{12}a_{21}b_3 - b_1a_{22}a_{31}}
{a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}}. \tag{1}
$$
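Formula (1) can be tested on a system whose solution is known in advance. The sketch below assumes Python with exact fractions; the function name `x3_by_formula_1` and the sample system (built so that the solution is 1, 2, 3) are hypothetical.

```python
from fractions import Fraction

def x3_by_formula_1(a, b):
    """Evaluate the third unknown by formula (1) for a 3x3 system."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = a
    b1, b2, b3 = b
    # The numerator is the denominator with a13, a23, a33 replaced by b1, b2, b3.
    numerator = (a11 * a22 * b3 + a12 * b2 * a31 + b1 * a21 * a32
                 - a11 * b2 * a32 - a12 * a21 * b3 - b1 * a22 * a31)
    denominator = (a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32
                   - a11 * a23 * a32 - a12 * a21 * a33 - a13 * a22 * a31)
    return Fraction(numerator, denominator)

# Hypothetical coefficients; the right-hand sides are chosen so that
# x1 = 1, x2 = 2, x3 = 3 satisfies all three equations.
coefficients = [[1, 1, 1],
                [2, -1, 1],
                [1, 2, -1]]
constants = [6, 3, 2]

x3 = x3_by_formula_1(coefficients, constants)
```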


We must carefully examine the construction of this expression. 
We note first of all that the numerator can be obtained from the denominator
simply by replacing the coefficients $a_{i3}$ of the unknown to be
determined by the constant terms $b_i$. This now leaves us with the
elucidation of the rule for forming the denominator, which contains 
no constant terms and is made up solely of coefficients of the system. 
We write down the coefficients in the form of a square array, which 
preserves the order in which they appear in the system: 


$$
\begin{Vmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{Vmatrix}. \tag{2}
$$


Our array consists here of three rows and three columns; the 
numbers $a_{ik}$ are known as its elements. The first subscript shows the
row in which the element appears, and the second subscript the 
column. We now write out the denominator of (1): 


$$
a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}. \tag{3}
$$


It can be seen to consist of six terms, each of which is the product 
of three elements of array (2), with one element taken from each 
row and one from each column. The products have the form: 


$$
a_{1p}\,a_{2q}\,a_{3r}, \tag{4}
$$

where p, q, r are the integers 1, 2, 3 arranged in some definite order.
Thus, the second, as well as the first subscripts form a set of the
integers 1, 2, 3, and to obtain all the terms of expression (3), we have
to take all the possible orders of the second subscripts p, q, r in (4).
There are clearly six possible permutations of the second subscripts:

1, 2, 3; 2, 3, 1; 3, 1, 2; 1, 3, 2; 2, 1, 3; 3, 2, 1, (5)

with the result that we obtain all six terms of the expression (3).
But some products (4) appear with the plus sign in expression (3) 
and others with the minus sign, so that we finally have to indicate 
some rule for the choice of sign. We notice that the products (4) with 
the plus sign have second subscripts forming the following per- 
mutations: 

1, 2, 3; 2, 3, 1; 3, 1, 2, (5$_1$)

whilst the products with the minus sign have second subscripts in
the permutations:

1, 3, 2; 2, 1, 3; 3, 2, 1. (5$_2$)

We now indicate how permutations (5$_1$) differ from permutations (5$_2$).
We refer to the fact that a larger number comes in front of a smaller
as an inversion in the permutation, and we calculate the number
of inversions in permutations (5$_1$). There is no inversion in the
first permutation, i.e. the number of inversions is zero. We pass
to the second permutation and compare the magnitude of each
number appearing in it with all those that follow. We see that there
are two inversions here: the 2 comes in front of the 1, and the 3
comes in front of the 1. It may readily be seen in the same way that
the third of permutations (5$_1$) contains two inversions. In short,
all the permutations (5$_1$) can be said to contain an even number of
inversions. On carrying out a similar investigation of permutations
(5$_2$), we see that they all contain an odd number of inversions.
We are now able to formulate a rule of signs in expression (3): the 
products (4) appear in (3) without change where the number of inver- 
sions in the permutation formed by the second subscripts is even. 
In contrast, the products appear in expression (3) with the minus sign 
when the permutation formed by the second subscripts contains an 
odd number of inversions. Expression (3) is known as a determinant 
of the third order, corresponding to the array of numbers (2). The 
above discussion can now be easily generalized for the case of a 
determinant of any order. 
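The rule of signs lends itself to a direct check. The sketch below assumes Python; the helper names `inversions` and `det3` are illustrative and not part of the text. It counts inversions, sorts the six permutations of the second subscripts by parity, and builds the third order determinant term by term.

```python
from itertools import permutations

def inversions(perm):
    """Count the pairs in which a larger number comes in front of a smaller."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

# Permutations of the second subscripts that carry the plus and minus signs.
plus_permutations = [p for p in permutations((1, 2, 3)) if inversions(p) % 2 == 0]
minus_permutations = [p for p in permutations((1, 2, 3)) if inversions(p) % 2 == 1]

def det3(a):
    """Third order determinant as the signed sum of the products a_1p a_2q a_3r."""
    total = 0
    for p, q, r in permutations((1, 2, 3)):
        sign = -1 if inversions((p, q, r)) % 2 else 1
        total += sign * a[0][p - 1] * a[1][q - 1] * a[2][r - 1]
    return total

# Hypothetical array used only as a numerical check.
sample = [[1, 1, 1],
          [2, -1, 1],
          [1, 2, -1]]
sample_value = det3(sample)
```

The two lists reproduce the groups of permutations with an even and with an odd number of inversions.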


Suppose we have $n^2$ numbers, arranged in a square array with n
rows and n columns:

$$
\begin{Vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{Vmatrix}. \tag{6}
$$

The elements $a_{ik}$ of this array are given complex numbers, whilst
the subscripts i and k indicate that the number $a_{ik}$ stands at the
intersection of the ith row and kth column. We form all the possible
products of elements of array (6) such that they contain one element
from each row and one from each column. These products will have
the form

$$
a_{1p_1}\,a_{2p_2} \cdots a_{np_n}, \tag{7}
$$

where $p_1, p_2, \ldots, p_n$ are the numbers 1, 2, ..., n, arranged in a
certain order. To obtain all the possible products of form (7), we have 
to take all the possible permutations of the second subscripts. We know 
from elementary algebra that the number of these permutations is 
equal to factorial n: 


$$
1 \cdot 2 \cdot 3 \cdots n = n!
$$

Each permutation will have a certain number of inversions
compared with the original permutation

$$
1, 2, 3, \ldots, n.
$$


The products (7) with second subscripts, forming a permutation 
with an even number of inversions, are taken without change, 
whereas we write a minus sign in front of the products in which the 
permutation of second subscripts has an odd number of inversions. 
The sum of all the products thus obtained is called an nth order 
determinant, corresponding to the array (6). This sum will evidently 
contain n! terms. The definition we have given may readily be expressed
as a formula. We shall use the following notation. Let $p_1, p_2, \ldots, p_n$
be a permutation of the numbers 1, 2, ..., n. We denote
the number of inversions in this permutation by the symbol

$$
[p_1, p_2, \ldots, p_n].
$$

Then the definition given above of the determinant, corresponding
to array (6), can be expressed as follows, the array being written
between vertical lines in order to indicate the determinant:

$$
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
= \sum_{(p_1, p_2, \ldots, p_n)} (-1)^{[p_1, p_2, \ldots, p_n]}\, a_{1p_1} a_{2p_2} \cdots a_{np_n}. \tag{8}
$$

The summation extends over all the possible permutations of the
second subscripts, i.e. over all the possible permutations $(p_1, p_2, \ldots, p_n)$.
When referring to the array as such, and not to the determinant
formed from it, we write it between double vertical lines.
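Definition (8) translates almost word for word into a short program. The sketch below assumes Python and integer entries; the names `inversions` and `det` are illustrative. Since the sum runs over all n! permutations, it is meant for small arrays only.

```python
from itertools import permutations

def inversions(perm):
    """Number of inversions [p_1, ..., p_n] of a permutation."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

def det(a):
    """Determinant of a square array by formula (8).

    Rows and columns are indexed from 0 here, which does not affect
    the number of inversions of the column permutation.
    """
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = -1 if inversions(p) % 2 else 1
        for row in range(n):
            term *= a[row][p[row]]  # one element from each row and each column
        total += term
    return total

second_order = det([[1, 2], [3, 4]])
triangular = det([[1, 2, 3], [0, 4, 5], [0, 0, 6]])
diagonal = det([[2, 0, 0, 0], [0, 3, 0, 0], [0, 0, 4, 0], [0, 0, 0, 5]])
```

For the triangular and diagonal arrays only the product of the diagonal elements survives, which gives a convenient check.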

It may be noticed that the factors in each product in expression (3) 
have been arranged so that the first subscripts form the basic per- 
mutation 1, 2,3, and hence all our remarks have been concerned 
with the permutations formed by the second subscripts. Instead, 
we can write the factors in the products so that the second subscripts 
always appear in increasing order, and (3) becomes in this case: 


$$
a_{11}a_{22}a_{33} + a_{31}a_{12}a_{23} + a_{21}a_{32}a_{13} - a_{11}a_{32}a_{23} - a_{21}a_{12}a_{33} - a_{31}a_{22}a_{13}. \tag{9}
$$

The first subscripts here give all the possible permutations p, q, r, 
and exactly the same rule of signs as above can be stated for the 
terms of (9), though now with respect to the first subscripts. This 
leads us to consider, along with sum (8), the analogous sum: 


$$
\sum_{(p_1, p_2, \ldots, p_n)} (-1)^{[p_1, p_2, \ldots, p_n]}\, a_{p_1 1}\, a_{p_2 2} \cdots a_{p_n n}. \tag{10}
$$

This latter sum clearly consists of the same terms as sum (8).
We shall see later that its terms have the same signs as in (8), i.e.
sum (10) coincides with sum (8), as in the case n = 3.

We go back finally to the case n = 2. Here the array has the form 


$$
\begin{Vmatrix}
a_{11} & a_{12}\\
a_{21} & a_{22}
\end{Vmatrix}
$$

and (8) gives the following expression for the second order determinant
corresponding to the array:

$$
\begin{vmatrix}
a_{11} & a_{12}\\
a_{21} & a_{22}
\end{vmatrix}
= a_{11}a_{22} - a_{12}a_{21}.
$$

It is clear from the above that an account of the properties of
determinants requires a closer acquaintance with the properties of
permutations and these form the subject of our next section.


2. Permutations. Suppose we have any n elements arranged in a definite
order. This is referred to as a permutation of the elements. We prove
first of all that there are n! different permutations. This is obvious
with n = 2, since two elements can give two different permutations.
With n = 3, the result follows directly from the list of permutations
(5), where the numbers 1, 2, 3 play the part of elements; we can
easily verify that (5) gives all the possible permutations of these
three elements. We prove our assertion for any integer n by induction.
We assume that the assertion is true for a given n and show that it
is then valid for (n + 1) elements. Thus, having assumed that n
elements give n! permutations, we consider (n + 1) elements which
we shall write as

$$
C_1, C_2, \ldots, C_{n+1}.
$$

We start by considering the permutations in which $C_1$ is the first
element. In order to obtain all these permutations, we must write $C_1$
in the first position, then write down all the possible permutations
of the remaining n elements. The number of these latter permutations
is equal to n! by hypothesis, and hence, the number of permutations
of elements $C_k$ starting with $C_1$ is equal to n!. Similarly, the number
of permutations of elements $C_k$ starting with $C_2$ is likewise equal
to n!. The same holds for each of the (n + 1) possible first elements,
so that the number of different permutations of elements $C_k$ will be
altogether

$$
n! \cdot (n + 1) = 1 \cdot 2 \cdots n \cdot (n + 1) = (n + 1)!,
$$


which is what we required to show. 
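The induction step suggests a recursive construction: fix each element in turn in the first position and follow it by every permutation of the remaining ones. The sketch below assumes Python; the function name `all_permutations` is illustrative.

```python
from math import factorial

def all_permutations(elements):
    """List every permutation of `elements`, following the induction step."""
    if len(elements) <= 1:
        return [list(elements)]
    result = []
    for index, first in enumerate(elements):
        remaining = elements[:index] + elements[index + 1:]
        # Each choice of first element contributes (number of remaining)! permutations.
        for tail in all_permutations(remaining):
            result.append([first] + tail)
    return result

# Number of permutations produced for 1, 2, ..., 6 elements.
counts = [len(all_permutations(list(range(1, n + 1)))) for n in range(1, 7)]
expected = [factorial(n) for n in range(1, 7)]
```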

We can naturally assume that our elements are taken as the integers 
starting from unity, and we shall confine ourselves to this case in 
future. We define a transposition as an operation in which the positions 
of two elements in a permutation are interchanged. It follows at once
that we can obtain from a given permutation any other permutation 
by carrying out a certain number of transpositions. For instance, let 
us take the two permutations of four elements 


1, 3, 4, 2; 2, 4, 1, 3.


We can pass from the first of these permutations to the second
with the aid of the following series of transpositions:


1, 3, 4, 2 → 2, 3, 4, 1 → 2, 4, 3, 1 → 2, 4, 1, 3.


Three transpositions have been needed here in order to pass from
the first permutation to the second. If we had used different transpositions,
our passage from the first permutation to the second
would have been by means of a different series; in other words, the
number of transpositions required for passage from one permutation
to another is not uniquely determined. The essential fact that we want
to show is that the different numbers of transpositions that may be
used are either all even or all odd for two given permutations. This
may be explained by bringing in the idea of an inversion which we
used in the previous article. Let us take permutations of the n elements
1, 2, ..., n. We call

1, 2, 3, ..., n (12)


the normal order, where the numbers appear in increasing order. 
We say that there is an inversion in a given permutation when 
two elements appear in it in a different order to that which they 
have in the normal order, in other words, when a larger number 
comes on the left of a smaller number. We define even permutations 
as those in which there is an even number of inversions, whilst odd 
permutations are those where the number of inversions is odd. The 
following theorem is fundamental for what follows. 

A transposition changes the number of inversions by an odd number. 

We take the permutation 


$$
a, b, \ldots, k, \ldots, p, \ldots, s \tag{13}
$$

and suppose that the elements k and p are transposed. After the
transposition, the arrangement of k and p with respect to the elements
to the left of k and the right of p remains as before. The only change
is in regard to the elements of the permutation lying between k and p,
except, of course, that the arrangement of k and p with regard to
each other is likewise changed. Let us work out the total change in
the number of inversions. Let there be altogether m elements lying
between k and p in permutation (13), and suppose that these intermediate
elements supply $\alpha$ normal orders and $\beta$ inversions in respect
to k, and similarly $\alpha_1$ normal orders and $\beta_1$ inversions in respect to
p. We obviously have:

$$
\alpha + \beta = \alpha_1 + \beta_1 = m. \tag{14}
$$

As a result of the transposition, a normal order becomes an
inversion and vice versa, or to put the matter more precisely,
if element k was in normal order with regard to a certain intermediate
element before transposition, it becomes inverted after
transposition and vice versa, whilst the same is true for element p.
Thus the total number of inversions for elements k and p in regard
to the intermediate elements was $\beta + \beta_1$ before transposition, and is
$\alpha + \alpha_1$ after transposition, i.e. the change in the number of inversions is

$$
\gamma = (\alpha + \alpha_1) - (\beta + \beta_1).
$$

We can use (14) to re-write this as:

$$
\gamma = (\alpha + \alpha_1) - (m - \alpha + m - \alpha_1) = 2(\alpha + \alpha_1 - m),
$$

whence it follows immediately that the number $\gamma$ is even. We still
have to take into account the change in the arrangement of elements
k and p in regard to each other. If they were in normal order before
transposition, they are afterwards inverted, and vice versa, i.e. the
change in the number of inversions is unity here; hence the total
change in the number of inversions due to transposition must be an
odd number.
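The theorem can be confirmed by exhaustive search for small n. The sketch below assumes Python; the helper names `inversions`, `transpose` and `changes_are_all_odd` are illustrative. It applies every possible transposition to every permutation and inspects the change in the number of inversions.

```python
from itertools import combinations, permutations

def inversions(perm):
    """Count the pairs in which a larger number stands to the left of a smaller."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

def transpose(perm, i, j):
    """Interchange the elements standing in positions i and j."""
    result = list(perm)
    result[i], result[j] = result[j], result[i]
    return tuple(result)

def changes_are_all_odd(n):
    """True if every transposition changes the inversion count by an odd number."""
    for perm in permutations(range(1, n + 1)):
        for i, j in combinations(range(n), 2):
            change = inversions(transpose(perm, i, j)) - inversions(perm)
            if change % 2 == 0:
                return False
    return True

verified = all(changes_are_all_odd(n) for n in range(2, 7))

# The two permutations 1, 3, 4, 2 and 2, 4, 1, 3 were joined by three
# transpositions, so their inversion counts must differ in parity.
parity_differs = (inversions((1, 3, 4, 2)) - inversions((2, 4, 1, 3))) % 2 == 1
```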

We notice some corollaries of the theorem. 

COROLLARY I. If we write down all the n! permutations and transpose
two definite elements in each, say elements 1 and 2, all the even
permutations become odd permutations and vice versa; whilst in
general the total aggregate of n! permutations is again obtained. It
follows at once from this that the numbers of even and odd permutations
are the same.
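The pairing behind Corollary I can be exhibited explicitly for n = 4. The sketch below assumes Python, with illustrative helper names; it interchanges the elements 1 and 2 (any fixed pair would serve) in every even permutation and compares the result with the set of odd permutations.

```python
from itertools import permutations

def inversions(perm):
    """Number of inversions of a permutation of 1, 2, ..., n."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

def swap_elements(perm, x, y):
    """Interchange the elements x and y wherever they stand in perm."""
    return tuple(y if value == x else x if value == y else value
                 for value in perm)

all_perms = set(permutations(range(1, 5)))
even = {p for p in all_perms if inversions(p) % 2 == 0}
odd = all_perms - even

# Transposing 1 and 2 in each even permutation yields exactly the odd ones.
image_of_even = {swap_elements(p, 1, 2) for p in even}
```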

COROLLARY II. Every permutation can be obtained from the normal
order by means of transpositions. It follows directly from the theorem
that even permutations are obtained by carrying out an even number
of transpositions on the normal order, and odd permutations by carrying
out an odd number of transpositions.

COROLLARY III. The choice of normal order is entirely arbitrary.
Any order other than (12) could have been taken as normal, in
which case, of course, the definition of inversion would require a
comparison with the new normal order. It may readily be seen that
if we take any even permutation as normal instead of (12), even
permutations still remain even, and similarly, odd permutations still
remain odd. On the contrary, if we take any odd permutation as
normal, even permutations become odd, and odd permutations
become even.


For instance, if we take 2, 1, 3 as the normal order in the six permutations
of the elements 1, 2, 3, we have as even permutations:

2, 1, 3; 1, 3, 2; 3, 2, 1.

The second of these permutations contains two inversions: the
1 stands in front of the 2 and the 3 is in front of the 2, whereas in
the normal order the 2 precedes the 1 and also precedes 3. The
odd permutations are:

1, 2, 3; 2, 3, 1; 3, 1, 2.

We have one inversion with respect to the normal 2, 1, 3 in the first
of these permutations, viz. 1 precedes 2.
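Counting inversions with respect to an arbitrary normal order only requires comparing positions in that order instead of magnitudes. The sketch below assumes Python; the function name `inversions_relative` is illustrative.

```python
from itertools import permutations

def inversions_relative(perm, normal):
    """Inversions of perm when `normal` is taken as the normal order."""
    # Position of each element in the chosen normal order.
    rank = {value: position for position, value in enumerate(normal)}
    ranks = [rank[value] for value in perm]
    return sum(1
               for i in range(len(ranks))
               for j in range(i + 1, len(ranks))
               if ranks[i] > ranks[j])

normal = (2, 1, 3)
even = {p for p in permutations((1, 2, 3))
        if inversions_relative(p, normal) % 2 == 0}
odd = {p for p in permutations((1, 2, 3))
       if inversions_relative(p, normal) % 2 == 1}
```

Since 2, 1, 3 is odd with respect to the order 1, 2, 3, the two classes are exchanged, in agreement with Corollary III.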

On taking into account what has been said above, we can state 
the rule of signs in expression (8) as follows: we write a plus sign 
in front of a product if the permutation of its second subscripts is even,
and a minus sign if the permutation is odd, the order 1, 2, ..., n being
taken as normal. 

We now elucidate one of the fundamental properties of determinants. 
We interchange the first and second columns in the array producing 
the determinant. The numbers written above as $a_{ik}$ will still be denoted
by the same letter with the same subscripts. Our interchange gives
us, instead of array (6):

$$
\begin{Vmatrix}
a_{12} & a_{11} & a_{13} & \cdots & a_{1n}\\
a_{22} & a_{21} & a_{23} & \cdots & a_{2n}\\
\vdots & \vdots & \vdots & & \vdots\\
a_{n2} & a_{n1} & a_{n3} & \cdots & a_{nn}
\end{Vmatrix}. \tag{15}
$$

We can now use the definition expressed by (8) to form the determinant
corresponding to array (15). The columns in this array are
enumerated by the following permutation: 2, 1, 3, ..., n, and this
must be taken as the normal order. It has been obtained from the
previous normal order by means of one transposition, and therefore
it was previously odd. Hence permutations that were previously odd
become even with the new choice of normal order, and vice versa. It
follows that the determinant corresponding to array (15) is the sum of
the same terms as appear in (8) but, due to the change of the
permutations of the second subscripts from odd to even or vice
versa, all the terms now have the opposite signs, i.e. the value
of a determinant changes sign on interchange of two columns. We have
proved this property by interchanging the first and second columns. 
Exactly the same proof applies for the interchange of any two 
columns. We have, for instance: 


The second determinant is obtained from the first by interchange 
of the second and third columns. 
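The change of sign can be observed numerically. The sketch below assumes Python; the array is hypothetical and the helper names are illustrative. It evaluates a determinant by definition (8) before and after the second and third columns are interchanged.

```python
from itertools import permutations

def inversions(perm):
    """Number of inversions of a permutation."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

def det(a):
    """Determinant by the signed sum over permutations of the column indices."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = -1 if inversions(p) % 2 else 1
        for row in range(n):
            term *= a[row][p[row]]
        total += term
    return total

def swap_columns(a, j, k):
    """Return a copy of the array with columns j and k interchanged."""
    result = [list(row) for row in a]
    for row in result:
        row[j], row[k] = row[k], row[j]
    return result

# Hypothetical third order array.
original = [[1, 2, 3],
            [4, 5, 6],
            [7, 8, 10]]
swapped = swap_columns(original, 1, 2)  # second and third columns (indices from 0)

value_before = det(original)
value_after = det(swapped)
```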

We consider a further property of determinants. A typical term 
of sum (8) is 


$$
(-1)^{[p_1, p_2, \ldots, p_n]}\, a_{1p_1}\, a_{2p_2} \cdots a_{np_n}. \tag{$16_1$}
$$

We can bring the second subscripts into normal order by changing
the order of the factors, but the first subscripts will now form some
permutation $q_1, q_2, \ldots, q_n$, the expression being now written as

$$
(-1)^{[p_1, p_2, \ldots, p_n]}\, a_{q_1 1}\, a_{q_2 2} \cdots a_{q_n n}. \tag{$16_2$}
$$

The transition from (16$_1$) to (16$_2$) requires a certain number of
transpositions of the factors. Each transposition implies a simultaneous
transposition of both the first and the second subscripts. If the number
of transpositions needed for passing from (16$_1$) to (16$_2$) is even, this
means that the permutation $p_1, p_2, \ldots, p_n$ is even, since it becomes
1, 2, ..., n with the aid of an even number of transpositions. It can
therefore be obtained similarly from the normal order with an even
number of transpositions. But now the permutation $q_1, q_2, \ldots, q_n$
must likewise be even, since it is obtained simultaneously from the
normal order with the aid of the same even number of transpositions.
Similarly, if $p_1, p_2, \ldots, p_n$ is odd, $q_1, q_2, \ldots, q_n$ is too. It follows from
this that $(-1)^{[p_1, p_2, \ldots, p_n]} = (-1)^{[q_1, q_2, \ldots, q_n]}$ and we can therefore write

$$
(-1)^{[p_1, p_2, \ldots, p_n]}\, a_{1p_1}\, a_{2p_2} \cdots a_{np_n} = (-1)^{[q_1, q_2, \ldots, q_n]}\, a_{q_1 1}\, a_{q_2 2} \cdots a_{q_n n}.
$$

Hence, if we compare corresponding terms in sums (8) and (10), it
will be seen that the sums are precisely the same. The rows play
the same role in sum (10) as the columns in sum (8). These remarks
lead directly to the result that, if all the rows and columns change
places in an array without changing their order, the value of the
determinant is unchanged.
For example, the following two third order determinants are equal: 


$$
\begin{vmatrix}
2 & 3 & 5\\
7 & 0 & 1\\
2 & 1 & 6
\end{vmatrix}
=
\begin{vmatrix}
2 & 7 & 2\\
3 & 0 & 1\\
5 & 1 & 6
\end{vmatrix}.
$$
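The equality of a determinant and the determinant with rows and columns exchanged can be checked directly. The sketch below assumes Python, with illustrative helper names; the array is the third order example read with first row 2, 3, 5, which is an assumption where the printed entries are hard to make out.

```python
from itertools import permutations

def inversions(perm):
    """Number of inversions of a permutation."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

def det(a):
    """Determinant by sum (8): the second subscripts are permuted."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = -1 if inversions(p) % 2 else 1
        for row in range(n):
            term *= a[row][p[row]]
        total += term
    return total

def transposed(a):
    """Array whose rows are the columns of a, taken in the same order."""
    return [list(column) for column in zip(*a)]

array = [[2, 3, 5],
         [7, 0, 1],
         [2, 1, 6]]

value = det(array)
value_of_transposed = det(transposed(array))
```

Evaluating `det(transposed(array))` amounts to forming sum (10) for the original array, so the agreement of the two values illustrates the equality of sums (8) and (10).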


3. Fundamental properties of determinants. I. We first of all state 
the property just proved — the value of a determinant is unchanged 
on replacing the rows by the columns. Everything below that is proved 
for columns is likewise valid for rows, and vice versa. 

II. We saw in the previous section that the interchange of two 
columns merely changes the sign of a determinant, the same being 
true for rows, i.e. on interchange of two rows (columns) the determinant 
merely changes sign. 

III. If a determinant has two identical rows, on the one hand
their interchange leads to the same determinant, whilst on the other
hand, by what has been proved, the determinant changes sign. Thus,
if we write the value of the determinant as $\Delta$, we have $\Delta = -\Delta$,
or $\Delta = 0$. In other words, a determinant with two identical rows
(columns) is zero.

IV. A linear homogeneous function of the variables $x_1, x_2, \ldots, x_n$
is defined as a first degree polynomial in these variables with no constant
term, i.e. it is an expression of the form

$$
\varphi(x_1, x_2, \ldots, x_n) = a_1x_1 + a_2x_2 + \ldots + a_nx_n,
$$

where the coefficients $a_i$ are independent of the $x_k$. Such a function
has two obvious properties:

$$
\begin{aligned}
\varphi(kx_1, kx_2, \ldots, kx_n) &= k\,\varphi(x_1, x_2, \ldots, x_n);\\
\varphi(x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n) &= \varphi(x_1, x_2, \ldots, x_n) + \varphi(y_1, y_2, \ldots, y_n).
\end{aligned}
$$


The latter property remains valid for any number of added terms. 
On returning to formula (8), we see that each term of the sum contains 
one and only one element from each row as a factor. It follows from 
this that a determinant is a linear homogeneous function of the elements 
of any given row (or of a given column). 

Consequently, if all the elements of a row (column) contain a common
factor, it can be taken outside the sign of the determinant. 


As indicated above, the value of the determinant corresponding
to array (6) is generally denoted by

$$
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
$$

or, more briefly, by

$$
|a_{ik}| \qquad (i, k = 1, 2, \ldots, n).
$$

The property just proved can be written in a particular case as, say,

$$
\begin{vmatrix}
ka_{11} & ka_{12} & ka_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{vmatrix}
= k
\begin{vmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{vmatrix}.
$$

The second property of linear homogeneous functions leads to the
following property of determinants: if the elements of a row (column)
are the sums of a like number of terms, the determinant is equal
to the sum of the determinants in which the elements of the row (column)
in question are replaced by the individual terms. We have, for example:

$$
\begin{vmatrix}
a & b & c + c'\\
d & e & f + f'\\
g & h & i + i'
\end{vmatrix}
=
\begin{vmatrix}
a & b & c\\
d & e & f\\
g & h & i
\end{vmatrix}
+
\begin{vmatrix}
a & b & c'\\
d & e & f'\\
g & h & i'
\end{vmatrix}.
$$


We note a further obvious consequence of linearity and homogeneity. 
If all the elements of a given row (column) are zero, the determinant 
vanishes. 
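The consequences of linearity and homogeneity can be checked on numbers. The sketch below assumes Python; the arrays are hypothetical and the helper names illustrative. It tests the removal of a common factor from a row, the splitting of a column into two summands, and the vanishing of a determinant with a zero row or with two identical rows.

```python
from itertools import permutations

def inversions(perm):
    """Number of inversions of a permutation."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

def det(a):
    """Determinant by the signed sum over permutations of the column indices."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = -1 if inversions(p) % 2 else 1
        for row in range(n):
            term *= a[row][p[row]]
        total += term
    return total

base = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]

# Common factor 5 in the first row.
scaled = [[5 * x for x in base[0]], base[1], base[2]]

# Third column 3, 6, 10 split as (1, 2, 3) + (2, 4, 7).
part_one = [[1, 2, 1], [4, 5, 2], [7, 8, 3]]
part_two = [[1, 2, 2], [4, 5, 4], [7, 8, 7]]

zero_row = [[1, 2, 3], [0, 0, 0], [7, 8, 10]]
equal_rows = [[1, 2, 3], [4, 5, 6], [1, 2, 3]]

checks = {
    "common factor": det(scaled) == 5 * det(base),
    "sum of columns": det(base) == det(part_one) + det(part_two),
    "zero row": det(zero_row) == 0,
    "identical rows": det(equal_rows) == 0,
}
```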

V. If we strike out the ith row and kth column (intersecting in $a_{ik}$)
from array (6), we are left with (n − 1) rows and (n − 1) columns.
The corresponding (n − 1)th order determinant is called the minor of
the original nth order determinant, corresponding to the element $a_{ik}$.
If we write this $\Delta_{ik}$, the product

$$
A_{ik} = (-1)^{i+k}\,\Delta_{ik} \tag{17}
$$

is called the cofactor of the element $a_{ik}$. We now show that these
cofactors are the coefficients of the linear homogeneous function
referred to in an earlier property, i.e. we have for any ith row:

$$
\Delta = A_{i1}a_{i1} + A_{i2}a_{i2} + \ldots + A_{in}a_{in} \qquad (i = 1, 2, \ldots, n) \tag{18}
$$

and for any kth column:

$$
\Delta = A_{1k}a_{1k} + A_{2k}a_{2k} + \ldots + A_{nk}a_{nk} \qquad (k = 1, 2, \ldots, n), \tag{19}
$$

where $\Delta$ is the value of the determinant. We have to show, in other
words, that if we collect all the terms in (8) containing a given element
$a_{ik}$, the coefficient of the element will be its cofactor $A_{ik}$, as defined
by (17). We write this coefficient to start with as $B_{ik}$ and observe
that it consists of the sum of the products of (n − 1) elements,
elements of the ith row and kth column being no longer included
in these products.

We first take the case i = k = 1 and write out the terms of sum
(8) containing $a_{11}$:

$$
a_{11} \sum_{(p_2, \ldots, p_n)} (-1)^{[1, p_2, \ldots, p_n]}\, a_{2p_2} \cdots a_{np_n}.
$$

The summation must extend over all the possible permutations
$p_2, p_3, \ldots, p_n$ of the numbers 2, 3, ..., n. The first element unity in
the full permutation $1, p_2, \ldots, p_n$ is in the normal order as regards
the remaining elements, so that we have for the number of inversions:

$$
[1, p_2, \ldots, p_n] = [p_2, \ldots, p_n],
$$

the permutation in which the numbers appear in increasing order
being taken as normal in both cases. We thus have the following
expression for the coefficient of $a_{11}$:

$$
B_{11} = \sum_{(p_2, p_3, \ldots, p_n)} (-1)^{[p_2, \ldots, p_n]}\, a_{2p_2} \cdots a_{np_n}.
$$

This sum comes within the definition of determinant, except that,
by comparison with the original determinant, the first row and first
column are missing. Hence it is clear that

$$
B_{11} = \Delta_{11} = (-1)^{1+1}\,\Delta_{11} = A_{11},
$$


Le. our statement is proved for i= & = 1. We turn to the case of 
any ¢ and k. We interchange the ith row in turn with higher rows 
until it arrives at the position of the first row. This requires (7 — 1) 
interchanges of rows. Similarly, the kth column is brought to the 
position of the first column by successive interchanges. These inter- 
changes move the element ay, upwards and to the left to the position 
of a,,. The row characterized by 7 and the column characterized by k 
appear in the first position, whilst the order of the remaining rows 
and columns remains unchanged. The result obtained above shows 
that the coefficient of a;, after these interchanges is equal to 4x. 
But we have employed (¢ — 1) + (k — 1) interchanges of rows and 
columns in pairs, and each such interchange adds a factor of (—1) 




to the determinant, i.e. we have added altogether the factor

$$(-1)^{(i-1)+(k-1)} = (-1)^{i+k},$$

and the final expression for the coefficient $B_{ik}$ is therefore:

$$B_{ik} = \frac{\Delta_{ik}}{(-1)^{i+k}} = (-1)^{i+k}\,\Delta_{ik} = A_{ik},$$

which is what we wished to prove. Thus we have proved formulae
(18) and (19). If we replace the elements of the $i$th row
in the determinant $\Delta$ by the numbers $c_1, c_2, \ldots, c_n$, whilst the
remaining rows are unchanged, the factors $A_{ik}$ in (18) will be unchanged,
and the value of the new determinant will be

$$\Delta' = A_{i1}c_1 + A_{i2}c_2 + \ldots + A_{in}c_n. \tag{20}$$

In particular, if we take $c_1, c_2, \ldots, c_n$ equal to the elements
$a_{j1}, a_{j2}, \ldots, a_{jn}$ of the $j$th row, where $j \ne i$, the determinant $\Delta'$ will
have two identical rows, the $i$th and the $j$th, so that it vanishes:
$\Delta' = 0$, i.e.

$$A_{i1}a_{j1} + A_{i2}a_{j2} + \ldots + A_{in}a_{jn} = 0 \quad (i \ne j). \tag{$21_1$}$$

Similarly for columns:

$$A_{1k}a_{1l} + A_{2k}a_{2l} + \ldots + A_{nk}a_{nl} = 0 \quad (k \ne l). \tag{$21_2$}$$


Expressions (18), (19) and (21) lead us to a property of determinants
that is important later.

If we multiply the elements of a row (column) by their cofactors
then add, we get the value of the determinant. On the other hand, if we
multiply the elements of a row (column) by the cofactors of the
corresponding elements of another row (column) then add, the sum is zero.
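
The property just stated may be checked numerically. What follows is only a minimal Python sketch: the array `a` and the names `det`, `cofactor` and `row_times_cofactors` are our own illustrative choices and do not occur in the text. The determinant is computed straight from the definition as a signed sum over permutations, and indices are counted from 0, which leaves the parity of $i + k$ unchanged.

```python
from itertools import permutations

def det(m):
    """Determinant from the definition: signed sum over all permutations."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = 1
        for row, col in enumerate(p):
            term *= m[row][col]
        total += (-1) ** inversions * term
    return total

def cofactor(m, i, k):
    """A_ik = (-1)^(i+k) times the minor left after striking out row i, column k."""
    minor = [row[:k] + row[k + 1:] for r, row in enumerate(m) if r != i]
    return (-1) ** (i + k) * det(minor)

def row_times_cofactors(m, i, j):
    """Elements of row i multiplied by the cofactors of row j, then added."""
    return sum(m[i][s] * cofactor(m, j, s) for s in range(len(m)))

a = [[2, -1, 3], [0, 4, 1], [5, 2, -2]]  # arbitrary illustrative array
print(det(a))
print([row_times_cofactors(a, i, i) for i in range(3)])  # own cofactors
print(row_times_cofactors(a, 0, 1))                      # cofactors of another row
```

With a row's own cofactors the sum reproduces the determinant, as in (18); with the cofactors of a different row it vanishes, as in ($21_1$).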

VI. We add the elements of the second row, multiplied by a factor $p$,
to the elements of the first row of the determinant $\Delta$. The elements
of the first row become

$$a_{1s} + p\,a_{2s} \quad (s = 1, 2, \ldots, n),$$

and by virtue of property IV, the new determinant is the sum of
two determinants: the original determinant and a second determinant
in which the first row consists of the elements

$$p\,a_{2s} \quad (s = 1, 2, \ldots, n),$$

whilst the remaining rows are the same as in $\Delta$. On taking $p$ out of
the first row, we get identical first and second rows, and the second
determinant, therefore, vanishes, i.e. in general, the value of a
determinant is unchanged if the elements of one row (column), multiplied
by a constant factor, are added to the corresponding elements of another
row (column).
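
Property VI lends itself to a quick numerical check. The Python fragment below is a sketch under our own assumptions: the array `a`, the factor 7 and the helper name `add_row_multiple` are illustrative and are not taken from the text.

```python
from itertools import permutations

def det(m):
    """Determinant from the definition: signed sum over all permutations."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = 1
        for row, col in enumerate(p):
            term *= m[row][col]
        total += (-1) ** inversions * term
    return total

def add_row_multiple(m, src, dst, p):
    """Return a copy of m in which p times row src has been added to row dst."""
    out = [row[:] for row in m]
    out[dst] = [x + p * y for x, y in zip(m[dst], m[src])]
    return out

a = [[2, -1, 3], [0, 4, 1], [5, 2, -2]]  # arbitrary illustrative array
b = add_row_multiple(a, src=1, dst=0, p=7)
print(det(a), det(b))  # the two values agree
```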

We now introduce some notation for future use. Given the square
array of numbers (6), let $l$ be a positive integer not greater than $n$.
We shall denote the determinant of order $l$, consisting of the rows
of (6) numbered $p_1, p_2, \ldots, p_l$ and the columns of (6) numbered
$q_1, q_2, \ldots, q_l$ as follows:

$$A\binom{p_1, p_2, \ldots, p_l}{q_1, q_2, \ldots, q_l} = \begin{vmatrix} a_{p_1q_1} & a_{p_1q_2} & \ldots & a_{p_1q_l} \\ a_{p_2q_1} & a_{p_2q_2} & \ldots & a_{p_2q_l} \\ \ldots & \ldots & \ldots & \ldots \\ a_{p_lq_1} & a_{p_lq_2} & \ldots & a_{p_lq_l} \end{vmatrix} \tag{22}$$

With this, a number $a_{pq}$ itself is generally referred to as the first order
determinant corresponding to $a_{pq}$, i.e. $A\binom{p}{q} = a_{pq}$. The sequences of
positive integers $p_1, p_2, \ldots, p_l$ and $q_1, q_2, \ldots, q_l$ need not necessarily
be arranged in order of increasing $p_s$ and $q_s$. If the numbers are in
fact in increasing order in both sequences, determinant (22) is called
a minor of order $l$ of determinant (8). The determinant (22) is obtained
from (8) by striking out $(n-l)$ rows and $(n-l)$ columns. Let the
rows and columns struck out be characterized by the numbers
$r_1, r_2, \ldots, r_{n-l}$ and $s_1, s_2, \ldots, s_{n-l}$ arranged in increasing order. The
minor

$$A\binom{r_1, r_2, \ldots, r_{n-l}}{s_1, s_2, \ldots, s_{n-l}}$$

is known as the complementary minor to (22), whilst the expression

$$(-1)^{p_1 + p_2 + \ldots + p_l + q_1 + q_2 + \ldots + q_l}\, A\binom{r_1, r_2, \ldots, r_{n-l}}{s_1, s_2, \ldots, s_{n-l}} \tag{$22_1$}$$

is known as the cofactor to minor (22). For a single element $a_{pq}$,
this definition of cofactor is the same as the previous definition (17).
We write the cofactor ($22_1$) as

$$A'\binom{p_1, p_2, \ldots, p_l}{q_1, q_2, \ldots, q_l}.$$

It is fully defined on specifying determinant (22), i.e. on specifying
the numerical sequences $p_1, p_2, \ldots, p_l$ of its rows and $q_1, q_2, \ldots, q_l$
of its columns.




Let us fix the numbers of the rows. The value of the determinant $\Delta$ is
evidently a homogeneous polynomial of degree $l$ in the elements of these rows,
and it can be shown to be given by the expression (Laplace's theorem):

$$\Delta = \sum_{q_1 < q_2 < \ldots < q_l} A\binom{p_1, p_2, \ldots, p_l}{q_1, q_2, \ldots, q_l}\, A'\binom{p_1, p_2, \ldots, p_l}{q_1, q_2, \ldots, q_l} \tag{23}$$

where the summation is carried out over all the possible increasing sequences
of $q_1, q_2, \ldots, q_l$ taken from the sequence $1, 2, \ldots, n$. The number of terms in
sum (23) is equal to the number of combinations of $l$ from $n$ elements:

$$C_n^l = \frac{n(n-1) \ldots (n-l+1)}{l!},$$

since the $q_s$ are only taken in increasing order when forming the sum. For $l = 1$,
we have $A\binom{p_1}{q_1} = a_{p_1q_1}$, and (23) becomes (18) with $i = p_1$. It is easy to
derive an expression analogous to (23) for the expansion of $\Delta$ in the elements of
any selected columns. No use will be made of (23) and the proof is omitted.
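
Since the proof of (23) is omitted, a numerical check may be helpful. The following Python sketch is ours, not the author's: the function names `submatrix` and `laplace` are illustrative assumptions, and rows and columns are counted from 0, which does not alter the parity of the exponent in ($22_1$). The fourth order array is the one evaluated in [4].

```python
from itertools import combinations, permutations

def det(m):
    """Determinant from the definition; the empty array has determinant 1."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = 1
        for row, col in enumerate(p):
            term *= m[row][col]
        total += (-1) ** inversions * term
    return total

def submatrix(m, rows, cols):
    return [[m[r][c] for c in cols] for r in rows]

def laplace(m, rows):
    """Expansion (23) of det(m) in the fixed rows (0-based, increasing order)."""
    n = len(m)
    other_rows = [r for r in range(n) if r not in rows]
    total = 0
    for cols in combinations(range(n), len(rows)):
        other_cols = [c for c in range(n) if c not in cols]
        sign = (-1) ** (sum(rows) + sum(cols))
        minor = det(submatrix(m, rows, cols))
        complementary = det(submatrix(m, other_rows, other_cols))
        total += minor * sign * complementary  # minor times its cofactor
    return total

m4 = [[3, 5, 1, 0], [2, 1, 4, 5], [1, 7, 4, 2], [-3, 5, 1, 1]]
print(det(m4), laplace(m4, (0, 1)), laplace(m4, (1, 3)), laplace(m4, (2,)))
```

Fixing a single row reduces the sum to the ordinary cofactor expansion (18).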


4. Evaluation of determinants. The evaluation of a second order
determinant is extremely simple. By (11), we merely write down the
array

$$\begin{matrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{matrix}$$

and take the product of the elements on the full diagonal ($a_{11}$, $a_{22}$) with
its own sign, and on the dotted diagonal ($a_{12}$, $a_{21}$) with the reverse sign.

We turn to third order determinants. We wrote down the expanded
form in (3). It may easily be verified that this can be obtained as
follows: we write out the array giving the determinant then write
out the first and second rows again underneath. This gives us an
array with six diagonals, with three elements on each. We take the
products of the elements on full line diagonals (running downwards
to the right) without change,

$$\begin{matrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{matrix}$$

and those of the elements on dotted diagonals (running downwards to
the left) with the minus sign.
The sum of the six products gives the determinant (Sarrus' rule).
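
Sarrus' rule can be written out directly. The sketch below is in Python; the function name `sarrus` and the sample arrays are our own illustrative assumptions. The first two rows are appended underneath, exactly as described, and the six diagonals are then read off.

```python
def sarrus(m):
    """Third order determinant by Sarrus' rule; m is a list of three rows."""
    ext = m + m[:2]  # the first and second rows written again underneath
    # Diagonals running downwards to the right keep their sign.
    plus = sum(ext[i][0] * ext[i + 1][1] * ext[i + 2][2] for i in range(3))
    # Diagonals running downwards to the left are taken with the minus sign.
    minus = sum(ext[i][2] * ext[i + 1][1] * ext[i + 2][0] for i in range(3))
    return plus - minus

a = [[2, -1, 3], [0, 4, 1], [5, 2, -2]]  # arbitrary illustrative array
print(sarrus(a))
```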


There is no generalization of this rule for higher order determinants and
we need a different procedure to shorten the working. For instance, property
VI of the previous section can often be used with advantage. We shall illustrate
this with an example. We take the fourth order determinant

$$\Delta = \begin{vmatrix} 3 & 5 & 1 & 0 \\ 2 & 1 & 4 & 5 \\ 1 & 7 & 4 & 2 \\ -3 & 5 & 1 & 1 \end{vmatrix}.$$

We multiply the third row by $(-2)$ and add it to the second; further, we
multiply the same row by 3 and add it to the fourth whilst subtracting it from the
first. By the property mentioned, we arrive at a determinant of the same value
as that written above, but now having three zeros in the first column:

$$\Delta = \begin{vmatrix} 0 & -16 & -11 & -6 \\ 0 & -13 & -4 & 1 \\ 1 & 7 & 4 & 2 \\ 0 & 26 & 13 & 7 \end{vmatrix}.$$

This gives us, on expanding by elements of the first column in accordance
with equation (19):

$$\Delta = \begin{vmatrix} -16 & -11 & -6 \\ -13 & -4 & 1 \\ 26 & 13 & 7 \end{vmatrix}.$$

We multiply the third column by 4 and add it to the second, then multiply the
same column by 13 and add it to the first. We thus get:

$$\Delta = \begin{vmatrix} -94 & -35 & -6 \\ 0 & 0 & 1 \\ 117 & 41 & 7 \end{vmatrix} = -\begin{vmatrix} -94 & -35 \\ 117 & 41 \end{vmatrix} = 94 \times 41 - 35 \times 117 = -241.$$
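
The procedure of the example, repeated use of property VI to produce zeros below a chosen element, can be carried out mechanically. The following Python sketch is one possible arrangement and not the author's method as such: it clears each column in turn, uses exact fractions, and changes the sign whenever two rows have to be interchanged. The name `det_by_elimination` is a hypothetical helper.

```python
from fractions import Fraction

def det_by_elimination(m):
    """Reduce to triangular form by adding multiples of rows (property VI)."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row with a non-zero element in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # the whole column below is zero
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # interchanging two rows changes the sign
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
        result *= a[col][col]
    return sign * result

m4 = [[3, 5, 1, 0], [2, 1, 4, 5], [1, 7, 4, 2], [-3, 5, 1, 1]]
print(det_by_elimination(m4))
```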

5. Examples. I. Let it be required to find the volume of a parallelepiped
whose sides are the vectors $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, having the same vertex as
origin. We know from [II, 105] that the required volume is given by
the scalar product of $\mathbf{A}$ and the vector product $(\mathbf{B} \times \mathbf{C})$:

$$V = \mathbf{A} \cdot (\mathbf{B} \times \mathbf{C}). \tag{24}$$

The volume is obtained with the plus sign if $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ have the same
orientation as the coordinate axes, and with the minus sign if the
orientation is different. The components of the vector product are

$$B_yC_z - B_zC_y, \quad B_zC_x - B_xC_z, \quad B_xC_y - B_yC_x,$$

and the scalar product of (24) is therefore

$$A_x(B_yC_z - B_zC_y) + A_y(B_zC_x - B_xC_z) + A_z(B_xC_y - B_yC_x).$$

This sum may easily be seen to represent a third order determinant, i.e.

$$V = \begin{vmatrix} A_x & B_x & C_x \\ A_y & B_y & C_y \\ A_z & B_z & C_z \end{vmatrix}. \tag{25}$$

The vanishing of this determinant shows us that the volume is zero,
in other words, that the three vectors are coplanar, i.e. they lie in
one plane. If we interchange two columns in the determinant,
say the first and second, this means that the order of vectors $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$
is changed to $\mathbf{B}$, $\mathbf{A}$, $\mathbf{C}$; if the vectors had the same orientation as the
axes in the previous sequence, they now have a different orientation,
and vice versa. The value of the determinant correspondingly changes
sign.
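
The agreement of (24) and (25), and the change of sign on interchanging two vectors, can be seen in a small computation. The Python sketch below uses our own helper names (`cross`, `dot`, `det3`, `volume`) and arbitrarily chosen vectors; none of these appear in the text.

```python
def cross(b, c):
    """Components of the vector product B x C."""
    return (b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(m):
    """Third order determinant expanded by its first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def volume(A, B, C):
    """Determinant (25): the vectors form the columns of the array."""
    return det3([[A[k], B[k], C[k]] for k in range(3)])

A, B, C = (1, 2, 3), (-1, 0, 4), (2, 5, 1)  # arbitrary illustrative vectors
print(dot(A, cross(B, C)), volume(A, B, C))  # expressions (24) and (25)
print(volume(B, A, C))                       # order B, A, C: opposite sign
```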

Similarly, if we take two vectors $(A_x, A_y)$ and $(B_x, B_y)$ in the $xy$
plane, the area of the parallelogram formed by them is equal to the
second order determinant

$$P = \begin{vmatrix} A_x & B_x \\ A_y & B_y \end{vmatrix}.$$

We now consider a triangle with vertices
$M_1(x_1, y_1)$, $M_2(x_2, y_2)$, $M_3(x_3, y_3)$.
We take the vectors $\mathbf{A} = \overline{M_1M_2}$ and $\mathbf{B} = \overline{M_1M_3}$, with components:

$$\overline{M_1M_2}\,(x_2 - x_1,\ y_2 - y_1), \quad \overline{M_1M_3}\,(x_3 - x_1,\ y_3 - y_1);$$

whilst the area of the triangle can be written as

$$P = \pm\frac{1}{2} \begin{vmatrix} x_2 - x_1 & x_3 - x_1 \\ y_2 - y_1 & y_3 - y_1 \end{vmatrix}.$$

It may readily be shown that the second order determinant can
be replaced by a third order determinant so that the above expression
becomes

$$P = \pm\frac{1}{2} \begin{vmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ 1 & 1 & 1 \end{vmatrix}.$$

The vanishing of this determinant gives the condition for the three
points $M_1$, $M_2$, $M_3$ to be collinear. In other words, the equation of
the straight line passing through two given points $(x_1, y_1)$ and $(x_2, y_2)$
can be written as

$$\begin{vmatrix} x & x_1 & x_2 \\ y & y_1 & y_2 \\ 1 & 1 & 1 \end{vmatrix} = 0.$$
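
The two forms of the area, and the collinearity condition, may be compared numerically. This is a minimal Python sketch with hypothetical helper names (`signed_area`, `edge_form`); the sample points are our own.

```python
def det3(m):
    """Third order determinant expanded by its first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def signed_area(p1, p2, p3):
    """Half the third order determinant; the sign depends on the orientation."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return det3([[x1, x2, x3], [y1, y2, y3], [1, 1, 1]]) / 2

def edge_form(p1, p2, p3):
    """Half the second order determinant built from M1M2 and M1M3."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2

print(signed_area((0, 0), (4, 0), (0, 3)), edge_form((0, 0), (4, 0), (0, 3)))
print(signed_area((0, 0), (1, 1), (2, 2)))  # collinear points
```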


II. The equations of some loci may easily be found by using
determinants. Suppose, for instance, we are seeking the equation of
the circle passing through three given points $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$.
The equation may be readily seen to be obtainable with the aid of
a fourth order determinant, as follows:

$$\begin{vmatrix} x^2 + y^2 & x_1^2 + y_1^2 & x_2^2 + y_2^2 & x_3^2 + y_3^2 \\ x & x_1 & x_2 & x_3 \\ y & y_1 & y_2 & y_3 \\ 1 & 1 & 1 & 1 \end{vmatrix} = 0. \tag{26}$$

On expanding by the first column, this equation is seen to be in
fact of the second degree, with the same coefficient for $x^2$ as $y^2$ and
with the term in $xy$ missing, i.e. (26) is the equation of a circle. Finally,
if we substitute $x = x_k$ and $y = y_k$ in the equation ($k = 1, 2, 3$),
the first column becomes identical with one of the others and the
equation is satisfied, i.e. the circle actually passes through the three
given points. It should be noted that, if the three given points are
collinear, the coefficient of $(x^2 + y^2)$ in equation (26) vanishes, so that
the equation corresponds to a straight line and not a circle.

Similarly, with axes $OX$, $OY$, $OZ$ in space, the equation of the plane
passing through three given points $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$, $(x_3, y_3, z_3)$
can be written as a fourth order determinant:

$$\begin{vmatrix} x & x_1 & x_2 & x_3 \\ y & y_1 & y_2 & y_3 \\ z & z_1 & z_2 & z_3 \\ 1 & 1 & 1 & 1 \end{vmatrix} = 0. \tag{27}$$

If the three given points are collinear, equation (27) reduces to the
identity $0 = 0$.
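
Expanding (26) by its first column gives the coefficients of the circle explicitly, from which centre and radius follow. The Python sketch below does this with exact fractions; the function name `circle_through` and the symbols `m1` to `m4` for the four third order minors are our own notation, not the text's.

```python
from fractions import Fraction

def det3(m):
    """Third order determinant expanded by its first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def circle_through(p1, p2, p3):
    """Centre and squared radius of the circle (26), or None if collinear."""
    xs = [Fraction(p[0]) for p in (p1, p2, p3)]
    ys = [Fraction(p[1]) for p in (p1, p2, p3)]
    ss = [x * x + y * y for x, y in zip(xs, ys)]
    ones = [1, 1, 1]
    # Minors of the first-column elements x^2 + y^2, x, y, 1 of (26);
    # the expansion reads (x^2 + y^2) m1 - x m2 + y m3 - m4 = 0.
    m1 = det3([xs, ys, ones])
    m2 = det3([ss, ys, ones])
    m3 = det3([ss, xs, ones])
    m4 = det3([ss, xs, ys])
    if m1 == 0:
        return None  # collinear points: (26) gives a straight line
    centre = (m2 / (2 * m1), -m3 / (2 * m1))
    radius_squared = centre[0] ** 2 + centre[1] ** 2 + m4 / m1
    return centre, radius_squared

print(circle_through((1, 0), (0, 1), (-1, 0)))
print(circle_through((0, 0), (1, 1), (2, 2)))
```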




III. We consider the determinant $D_n$ of order $n$, each row of which
consists of powers of a certain number, starting with the $(n-1)$th
power and down to and including zero:

$$D_n = \begin{vmatrix} x_1^{n-1} & x_1^{n-2} & \ldots & x_1 & 1 \\ x_2^{n-1} & x_2^{n-2} & \ldots & x_2 & 1 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ x_n^{n-1} & x_n^{n-2} & \ldots & x_n & 1 \end{vmatrix}. \tag{28}$$

We have with $n = 1$ and $n = 2$:

$$D_1 = 1; \quad D_2 = x_1 - x_2.$$

To expand the determinant $D_3$, we replace the number $x_1$ in its first
row by the letter $x$. We get the determinant:

$$D_3(x) = \begin{vmatrix} x^2 & x & 1 \\ x_2^2 & x_2 & 1 \\ x_3^2 & x_3 & 1 \end{vmatrix}.$$

On expanding by the first row, we see that $D_3(x)$ is a second
degree polynomial in $x$. If we substitute $x = x_2$ or $x = x_3$ in the
determinant, the first row becomes the same as the second or third
and the determinant has zero value, i.e. the quadratic polynomial $D_3(x)$
has roots $x_2$ and $x_3$ and may be written as

$$D_3(x) = A_3 (x - x_2)(x - x_3),$$

where $A_3$ is the coefficient of $x^2$, i.e. the cofactor of the element $x^2$
appearing at the top left-hand corner of $D_3(x)$. It follows from this
that

$$A_3 = \begin{vmatrix} x_2 & 1 \\ x_3 & 1 \end{vmatrix} = x_2 - x_3,$$

i.e. $A_3$ is the determinant $D_2$, consisting of the numbers $x_2$ and $x_3$.
Finally:

$$D_3(x) = (x_2 - x_3)(x - x_2)(x - x_3).$$

On substituting $x = x_1$, we obtain an expression for $D_3$ as the product
of three factors:

$$D_3 = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3).$$




Having found $D_3$, we can find an expression for $D_4$ in precisely the
same way. It is the product of six factors:

$$D_4 = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4).$$

Similarly, for any $n$, we find the following expression for $D_n$, which
is generally known as Vandermonde's determinant:

$$D_n = (x_1 - x_2)(x_1 - x_3) \ldots (x_1 - x_n)\,(x_2 - x_3) \ldots (x_2 - x_n) \ldots (x_{n-1} - x_n). \tag{29}$$
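
Formula (29) can be compared with a direct evaluation of (28) for particular numbers. The following Python sketch uses our own names (`vandermonde`, `product_formula`) and an arbitrary set of numbers `xs`.

```python
from itertools import permutations

def det(m):
    """Determinant from the definition: signed sum over all permutations."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = 1
        for row, col in enumerate(p):
            term *= m[row][col]
        total += (-1) ** inversions * term
    return total

def vandermonde(xs):
    """Array (28): row i holds the powers of xs[i] from n - 1 down to 0."""
    n = len(xs)
    return [[x ** (n - 1 - j) for j in range(n)] for x in xs]

def product_formula(xs):
    """Right-hand side of (29): product of (x_i - x_j) over all i < j."""
    result = 1
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            result *= xs[i] - xs[j]
    return result

xs = [2, 3, 5, 7]  # arbitrary illustrative numbers
print(det(vandermonde(xs)), product_formula(xs))
```

If two of the numbers coincide, both sides vanish: the array has two identical rows and the product contains a zero factor.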


This expression has an interesting connection with the basic
definition of determinant. Any $n$th order determinant can be written:

$$\begin{vmatrix} x_{1n} & x_{1,n-1} & \ldots & x_{11} \\ x_{2n} & x_{2,n-1} & \ldots & x_{21} \\ \ldots & \ldots & \ldots & \ldots \\ x_{nn} & x_{n,n-1} & \ldots & x_{n1} \end{vmatrix}. \tag{30}$$

We carry out the purely formal substitution of $x_i^{k-1}$ for each element
$x_{ik}$. As a result of this, determinant (30) clearly becomes
Vandermonde's determinant (28). An immediate consequence is the following
rule for finding the sum giving the value of (30): we remove the
brackets in expression (29) and replace $x_i^{k-1}$ in each of the resultant
terms by $x_{ik}$; if a power of $x_i$ is missing in a product term, we add
the factor $x_i^0$, which after substitution becomes $x_{i1}$. It may be
remarked that this last rule can be taken as the definition of a
determinant.

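IV. We consider an expression which will concern us later on:

$$\Delta(x) = \begin{vmatrix} a_{11} + x & a_{12} & a_{13} & \ldots & a_{1n} \\ a_{21} & a_{22} + x & a_{23} & \ldots & a_{2n} \\ a_{31} & a_{32} & a_{33} + x & \ldots & a_{3n} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ a_{n1} & a_{n2} & a_{n3} & \ldots & a_{nn} + x \end{vmatrix} \tag{31}$$

and expand it in powers of $x$; for this, we first re-write it as follows:

$$\Delta(x) = \begin{vmatrix} a_{11} + x & a_{12} + 0 & a_{13} + 0 & \ldots & a_{1n} + 0 \\ a_{21} + 0 & a_{22} + x & a_{23} + 0 & \ldots & a_{2n} + 0 \\ a_{31} + 0 & a_{32} + 0 & a_{33} + x & \ldots & a_{3n} + 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ a_{n1} + 0 & a_{n2} + 0 & a_{n3} + 0 & \ldots & a_{nn} + x \end{vmatrix}. \tag{32}$$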


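Each column of this determinant is the sum of two terms and we
can re-write it by means of repeated application of property IV above
as the sum of $2^n$ determinants, the columns of which contain no sums.
If we strike out the second term in all the columns of (32), we get a
term which does not include $x$, i.e. the constant term in the
expansion of $\Delta(x)$:

$$\Delta = \begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix}. \tag{33}$$

On the other hand, if we strike out the first terms in all the columns,
we get the leading term of polynomial $\Delta(x)$:

$$\begin{vmatrix} x & 0 & \ldots & 0 \\ 0 & x & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & \ldots & x \end{vmatrix} = x^n.$$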


We now consider the middle terms of the polynomial. Suppose we
retain the second term in the $p_1$th, $p_2$th, ..., $p_s$th columns, and retain the
first term in the remaining columns. Each $p_k$th column ($k = 1, 2, \ldots, s$)
will now consist entirely of zeros except for the single element $x$
on the principal diagonal, i.e. on the intersections of rows and columns
characterized by the same number. On successively expanding the
present determinant by the $p_1$th, $p_2$th, ..., $p_s$th columns, we get the
factor $x^s$ from these columns and have to strike out the $p_1$th, $p_2$th, ..., $p_s$th
rows and columns. The cofactor of the element $x$ after each striking
out is precisely equal to the minor since the row and column
struck out are both characterized by the same number. It follows
that, for any choice of columns $p_k$ ($k = 1, 2, \ldots, s$), our determinant
will contain $x^s$ with a coefficient equal to the determinant of order
$(n-s)$ obtained from the original determinant (33) by striking out the
rows and columns whose intersections consist of the elements
$a_{p_1p_1}, a_{p_2p_2}, \ldots, a_{p_sp_s}$ of the principal diagonal. We write this
$(n-s)$th order determinant symbolically as $\Delta^{p_1 p_2 \ldots p_s}_{p_1 p_2 \ldots p_s}$.

This is usually called a principal minor of order $(n-s)$ of the
determinant $\Delta$. Different choices of $p_1, p_2, \ldots, p_s$ lead us eventually
to the final coefficient of $x^s$ in the expression for $\Delta(x)$ as the sum of all
the possible principal minors of order $(n-s)$, i.e.

$$\Delta(x) = x^n + S_1x^{n-1} + S_2x^{n-2} + \ldots + S_{n-1}x + S_n,$$

where $S_k$ is the sum of all the $k$th order principal minors of $\Delta$. In
particular, $S_n = \Delta$. The expression for the coefficient is explicitly

$$S_k = \sum_{p_1 < p_2 < \ldots < p_{n-k}} \Delta^{p_1 p_2 \ldots p_{n-k}}_{p_1 p_2 \ldots p_{n-k}} = \sum_{q_1 < q_2 < \ldots < q_k} \begin{vmatrix} a_{q_1q_1} & a_{q_1q_2} & \ldots & a_{q_1q_k} \\ a_{q_2q_1} & a_{q_2q_2} & \ldots & a_{q_2q_k} \\ \ldots & \ldots & \ldots & \ldots \\ a_{q_kq_1} & a_{q_kq_2} & \ldots & a_{q_kq_k} \end{vmatrix}. \tag{34}$$

Here the summation extends over all the possible combinations
of the $k$ numbers $q_1, q_2, \ldots, q_k$, taken in increasing order from among
the numbers $1, 2, \ldots, n$. If the summation in (34) were simply over
each subscript $q_i$ for all values from 1 to $n$, the integers would appear
in the permutation $q_1, q_2, \ldots, q_k$ in all possible orders and not solely
in increasing order. To be precise, every increasing sequence in the
summation over all $q_i$ from 1 to $n$ would have its place taken by $k!$
permutations in all. We now observe that the magnitude of the
determinant appearing in (34) is unchanged on interchange of any two
numbers $q_i$ and $q_j$. Suppose, for instance, that $q_1$ and $q_2$ are
interchanged, then the first and second rows and columns are interchanged
in the determinant, which has no effect on its value. It follows from
these remarks that, if the summation in (34) is simply over each
of the $q_i$ from 1 to $n$, each term in sum (34) will be repeated $k!$ times,
so that we can write the coefficient $S_k$ in the alternative form:

$$S_k = \frac{1}{k!} \sum_{q_1=1}^{n} \sum_{q_2=1}^{n} \ldots \sum_{q_k=1}^{n} A\binom{q_1, q_2, \ldots, q_k}{q_1, q_2, \ldots, q_k}. \tag{35}$$

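6. Multiplication of determinants. We derive a formula in this
article for the product of two determinants of the same order.
Let us be given two $n$th order determinants:

$$\Delta = |a_{ik}|_1^n \tag{$36_1$}$$

and

$$\Delta_1 = |b_{ik}|_1^n. \tag{$36_2$}$$

We form a new determinant, the elements of which are given by

$$c_{ik} = \sum_{s=1}^{n} a_{is}b_{sk} \quad (i, k = 1, 2, \ldots, n) \tag{37}$$

and we show that this determinant is equal to the product of
determinants ($36_1$) and ($36_2$). We start with the case $n = 2$. On taking into
account (37) and expanding the determinant in accordance with
property IV of [3], we get:

$$\begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix} = \begin{vmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{vmatrix}$$

$$= \begin{vmatrix} a_{11}b_{11} & a_{11}b_{12} \\ a_{21}b_{11} & a_{21}b_{12} \end{vmatrix} + \begin{vmatrix} a_{11}b_{11} & a_{12}b_{22} \\ a_{21}b_{11} & a_{22}b_{22} \end{vmatrix} + \begin{vmatrix} a_{12}b_{21} & a_{11}b_{12} \\ a_{22}b_{21} & a_{21}b_{12} \end{vmatrix} + \begin{vmatrix} a_{12}b_{21} & a_{12}b_{22} \\ a_{22}b_{21} & a_{22}b_{22} \end{vmatrix}.$$

On taking outside the common factors of the same columns, the first
and fourth terms on the right-hand side yield determinants with
identical columns and these vanish. On interchanging columns in one
of the determinants that remain, we find that

$$\begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix} = b_{11}b_{22} \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} + b_{12}b_{21} \begin{vmatrix} a_{12} & a_{11} \\ a_{22} & a_{21} \end{vmatrix} = b_{11}b_{22} \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} - b_{12}b_{21} \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$$

$$= (b_{11}b_{22} - b_{12}b_{21}) \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \cdot \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix},$$

which is what we had to prove.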
In the general case of order $n$, we have after applying property IV
of [3]:

$$|c_{ik}|_1^n = \sum_{s_1, s_2, \ldots, s_n = 1}^{n} \begin{vmatrix} a_{1s_1}b_{s_11} & a_{1s_2}b_{s_22} & \ldots & a_{1s_n}b_{s_nn} \\ a_{2s_1}b_{s_11} & a_{2s_2}b_{s_22} & \ldots & a_{2s_n}b_{s_nn} \\ \ldots & \ldots & \ldots & \ldots \\ a_{ns_1}b_{s_11} & a_{ns_2}b_{s_22} & \ldots & a_{ns_n}b_{s_nn} \end{vmatrix}, \tag{38}$$

where the variables of summation $s_1, s_2, \ldots, s_n$ take the integral values
$1, 2, \ldots, n$. The terms of this sum can be written:

$$\begin{vmatrix} a_{1s_1} & a_{1s_2} & \ldots & a_{1s_n} \\ a_{2s_1} & a_{2s_2} & \ldots & a_{2s_n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{ns_1} & a_{ns_2} & \ldots & a_{ns_n} \end{vmatrix}\, b_{s_11}b_{s_22} \ldots b_{s_nn}. \tag{39}$$

If any of the numbers $s_1, s_2, \ldots, s_n$ are the same, there will be equal
columns in the above determinant and it vanishes. We can thus confine
ourselves to the terms for which none of the $s_k$ are the same, so that
the sequence $s_1, s_2, \ldots, s_n$ represents a permutation of $1, 2, \ldots, n$.
Twice multiplying (39) by $(-1)^{[s_1, s_2, \ldots, s_n]}$ evidently leaves the
expression unchanged, so that we can write it as the product of two
factors:

$$(-1)^{[s_1, s_2, \ldots, s_n]} \begin{vmatrix} a_{1s_1} & a_{1s_2} & \ldots & a_{1s_n} \\ a_{2s_1} & a_{2s_2} & \ldots & a_{2s_n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{ns_1} & a_{ns_2} & \ldots & a_{ns_n} \end{vmatrix} \cdot (-1)^{[s_1, s_2, \ldots, s_n]}\, b_{s_11}b_{s_22} \ldots b_{s_nn}.$$

We transpose columns in the first factor so that the sequence $s_1, s_2, \ldots, s_n$
becomes $1, 2, \ldots, n$. Each transposition changes the sign of
$(-1)^{[s_1, s_2, \ldots, s_n]}$ and also changes the sign of the determinant, so that
the factor as a whole remains unchanged. Hence we can write (39) as

$$\begin{vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{vmatrix} \cdot (-1)^{[s_1, s_2, \ldots, s_n]}\, b_{s_11}b_{s_22} \ldots b_{s_nn},$$

and we now obtain, on returning to sum (38):

$$|c_{ik}|_1^n = \Delta \sum_{(s_1, s_2, \ldots, s_n)} (-1)^{[s_1, s_2, \ldots, s_n]}\, b_{s_11}b_{s_22} \ldots b_{s_nn},$$

where the summation extends over all the permutations $s_1, s_2, \ldots, s_n$
of the numbers $1, 2, \ldots, n$. This latter sum is the determinant $\Delta_1$, i.e.
$|c_{ik}|_1^n = \Delta\Delta_1$, which is what we had to show. Equation (37) amounts
to the following: the elements of the $i$th row of determinant $\Delta$ are
multiplied by the corresponding elements of the $k$th column of the
second determinant, then the products are added. We know that the rows
can be replaced by the columns in a determinant without changing
its value. The above rule for multiplying rows by columns can
therefore be replaced by three alternative rules, for multiplying columns
by columns, columns by rows, and rows by rows.
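We can finally state the theorem as: let

$$|a_{ik}|_1^n \quad \text{and} \quad |b_{ik}|_1^n$$

be two determinants of any order $n$.
We form a new determinant

$$|c_{ik}|_1^n,$$

whose elements are given by one of the following expressions:

$$c_{ik} = \sum_{s=1}^{n} a_{is}b_{sk}, \tag{$40_1$}$$

$$c_{ik} = \sum_{s=1}^{n} a_{is}b_{ks}, \tag{$40_2$}$$

$$c_{ik} = \sum_{s=1}^{n} a_{si}b_{sk}, \tag{$40_3$}$$

$$c_{ik} = \sum_{s=1}^{n} a_{si}b_{ks} \quad (i, k = 1, 2, \ldots, n). \tag{$40_4$}$$

The value of the determinant $|c_{ik}|_1^n$ is now equal to the product of $|a_{ik}|_1^n$
and $|b_{ik}|_1^n$.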
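
The four rules of multiplication can be tried on a numerical example. The Python sketch below is ours: `matmul` multiplies rows by columns, and the other three rules are obtained by first interchanging rows and columns with `transpose`. The arrays `a` and `b` are arbitrary illustrative choices.

```python
from itertools import permutations

def det(m):
    """Determinant from the definition: signed sum over all permutations."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = 1
        for row, col in enumerate(p):
            term *= m[row][col]
        total += (-1) ** inversions * term
    return total

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Rows of a multiplied by columns of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

a = [[2, -1, 3], [0, 4, 1], [5, 2, -2]]
b = [[1, 2, 0], [3, -1, 4], [2, 2, 1]]

rules = {
    "rows by columns": matmul(a, b),
    "rows by rows": matmul(a, transpose(b)),
    "columns by columns": matmul(transpose(a), b),
    "columns by rows": matmul(transpose(a), transpose(b)),
}
for name, c in rules.items():
    print(name, det(c), det(a) * det(b))
```

The four arrays $c_{ik}$ differ in general, but their determinants coincide.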
Example. We consider, along with the original determinant

$$\Delta = |a_{ik}|_1^n$$

the determinant consisting of the cofactors of its elements

$$|A_{ik}|_1^n.$$

We shall express the product $|a_{ik}|_1^n \cdot |A_{ik}|_1^n$ as a further determinant,
by multiplying rows by rows, in accordance with the above theorem.
The new determinant has the following elements:

$$c_{ik} = \sum_{s=1}^{n} a_{is}A_{ks}.$$

We obtain from property V of determinants:

$$c_{ik} = 0 \quad (i \ne k), \qquad c_{ii} = \Delta,$$

i.e.

$$|a_{ik}|_1^n \cdot |A_{ik}|_1^n = \begin{vmatrix} \Delta & 0 & 0 & \ldots & 0 \\ 0 & \Delta & 0 & \ldots & 0 \\ 0 & 0 & \Delta & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & 0 & \ldots & \Delta \end{vmatrix}$$

or, as may readily be seen:

$$|a_{ik}|_1^n\, |A_{ik}|_1^n = \Delta^n, \quad \text{i.e.} \quad \Delta\, |A_{ik}|_1^n = \Delta^n.$$

We have on dividing through by $\Delta$, assumed non-zero:

$$|A_{ik}|_1^n = \Delta^{n-1}. \tag{41}$$

If the elements $a_{ik} = a_{ik}^{(0)}$ are such that $\Delta$ vanishes, we can find elements
$a_{ik}$ as near as we like to $a_{ik}^{(0)}$ such that $\Delta$ differs from zero. Equation (41)
is valid for these $a_{ik}$ and on passing to the limit with $a_{ik} \to a_{ik}^{(0)}$, we
see that the equation remains valid for $a_{ik} = a_{ik}^{(0)}$, i.e. for $\Delta = 0$.
If $\Delta$ and $A_{ik}$ are written in terms of the elements $a_{ik}$, (41) represents
an identity with respect to the $a_{ik}$.
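
Identity (41) may be verified on particular arrays, including one whose determinant vanishes. The following is a minimal Python sketch with hypothetical helper names (`cofactor`, `cofactor_array`) and arbitrarily chosen arrays.

```python
from itertools import permutations

def det(m):
    """Determinant from the definition: signed sum over all permutations."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = 1
        for row, col in enumerate(p):
            term *= m[row][col]
        total += (-1) ** inversions * term
    return total

def cofactor(m, i, k):
    minor = [row[:k] + row[k + 1:] for r, row in enumerate(m) if r != i]
    return (-1) ** (i + k) * det(minor)

def cofactor_array(m):
    """The array of cofactors A_ik of the elements a_ik."""
    n = len(m)
    return [[cofactor(m, i, k) for k in range(n)] for i in range(n)]

a = [[2, -1, 3], [0, 4, 1], [5, 2, -2]]      # determinant differs from zero
singular = [[1, 2, 3], [2, 4, 6], [0, 1, 5]]  # two proportional rows

for m in (a, singular):
    n = len(m)
    print(det(cofactor_array(m)), det(m) ** (n - 1))
```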


7. Rectangular arrays. We shall encounter later on arrays in which
the number of rows can differ from the number of columns. This
more general type of array is exemplified by

$$\begin{Vmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{Vmatrix}. \tag{42}$$

It contains $m$ rows and $n$ columns, where $m$ and $n$ can be the
same or different. On striking out rows and columns so that the
number of each is the same, we can form determinants from the
remainder. We say that these determinants enter into the constitution
of array (42). The highest order that they can have is evidently equal
to the lesser of the two numbers $m$ and $n$, whilst the least order is
unity, the first order determinants being in fact the actual elements
of array (42). Suppose that all the determinants of a certain order $l$
appearing in the array are zero. It may readily be seen that all
the determinants of order $(l+1)$ in the array are likewise zero.
In fact, each determinant of order $(l+1)$ can be expressed as the
sum of the products of the elements of a given row with the cofactors
of these elements. But the latter, except for sign, coincide with
determinants of order $l$ of the array, and are therefore all zero. Since all
the determinants of order $(l+1)$ are zero, it follows as above that all
the determinants of order $(l+2)$ likewise vanish, and so on. Thus,
if all the determinants of a given order appearing in array (42) vanish,
all the higher order determinants of the array likewise vanish.

This brings us to an important concept regarding array (42), the
array being more commonly known as a matrix. The rank of matrix
(42) is defined as the order of the highest order non-vanishing determinant
of the matrix, i.e. if the rank is $k$, there is at least one non-vanishing
determinant of order $k$ in the matrix, whereas all the determinants of
order $(k+1)$ vanish.
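
The definition of rank can be followed literally, by searching for a non-vanishing determinant of the highest possible order. The Python sketch below does exactly this; it is meant only to illustrate the definition (the search is very slow for large arrays), and the names `rank_by_minors` and `m` are our own.

```python
from itertools import combinations, permutations

def det(m):
    """Determinant from the definition: signed sum over all permutations."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = 1
        for row, col in enumerate(p):
            term *= m[row][col]
        total += (-1) ** inversions * term
    return total

def rank_by_minors(m):
    """Order of the highest order non-vanishing determinant of the array m."""
    rows, cols = len(m), len(m[0])
    for order in range(min(rows, cols), 0, -1):
        for r in combinations(range(rows), order):
            for c in combinations(range(cols), order):
                if det([[m[i][j] for j in c] for i in r]) != 0:
                    return order
    return 0  # every element is zero

m = [[1, 2, 3, 4], [2, 4, 6, 8], [0, 1, 0, 1]]  # second row is twice the first
print(rank_by_minors(m))
```

Because the orders are tried from the highest downwards, the search relies on the remark above: once all determinants of a given order vanish, so do all those of higher order.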

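Let us consider, along with matrix (42), the array

$$\begin{Vmatrix} b_{11} & b_{12} & \ldots & b_{1m} \\ b_{21} & b_{22} & \ldots & b_{2m} \\ \ldots & \ldots & \ldots & \ldots \\ b_{n1} & b_{n2} & \ldots & b_{nm} \end{Vmatrix} \tag{43}$$

consisting of $n$ rows and $m$ columns. We form the $m^2$ numbers

$$c_{ik} = \sum_{s=1}^{n} a_{is}b_{sk} \quad (i, k = 1, 2, \ldots, m). \tag{44}$$

The square array made up of the $c_{ik}$ is usually known as the product
of rectangular arrays (42) and (43).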

We prove a generalization of the theorem for multiplication of
determinants.

THEOREM. If $m \le n$, we have

$$|c_{ik}|_1^m = \sum_{r_1 < r_2 < \ldots < r_m} A\binom{1, 2, \ldots, m}{r_1, r_2, \ldots, r_m}\, B\binom{r_1, r_2, \ldots, r_m}{1, 2, \ldots, m}, \tag{45}$$

where the summation extends over all the $r_k$ of the sequence $1, 2, \ldots, n$,
satisfying the inequalities indicated. If $m > n$, the determinant $|c_{ik}|_1^m$
vanishes.

The meaning of the symbols

$$A\binom{1, 2, \ldots, m}{r_1, r_2, \ldots, r_m} \quad \text{and} \quad B\binom{r_1, r_2, \ldots, r_m}{1, 2, \ldots, m}$$

is given in [3]. The second denotes the determinant formed from the
elements of the $r_1$th, $r_2$th, ..., $r_m$th rows and first, second, ..., $m$th
columns of array (43). For $m = n$, the sum in (45) reduces to the single
term corresponding to $r_1 = 1$, $r_2 = 2$, ..., $r_m = m$, and (45) expresses
the theorem for multiplication of determinants.

We take the case $m \le n$. The proof of (45) is analogous to that for
the multiplication of determinants; we have as in the latter case:

$$|c_{ik}|_1^m = \sum_{s_1, s_2, \ldots, s_m} A\binom{1, 2, \ldots, m}{s_1, s_2, \ldots, s_m}\, b_{s_11}b_{s_22} \ldots b_{s_mm}, \tag{46}$$

where each of the $s_k$ can take the values $1, 2, \ldots, n$, and where terms
can be neglected for which some of the $s_k$ are equal, since such terms
are zero. We take a definite sequence of numbers $r_1 < r_2 < \ldots < r_m$
from the sequence $1, 2, \ldots, n$ and we distinguish the terms of sum (46)
for which the set $s_1, s_2, \ldots, s_m$ coincides with the set $r_1, r_2, \ldots, r_m$.
This gives us part of sum (46):

$$\sum_{(t_1, t_2, \ldots, t_m)} A\binom{1, 2, \ldots, m}{t_1, t_2, \ldots, t_m}\, b_{t_11}b_{t_22} \ldots b_{t_mm}, \tag{47}$$

where summation is over all the possible permutations $(t_1, t_2, \ldots, t_m)$
of $r_1, r_2, \ldots, r_m$. On multiplying each term of (47) twice by
$(-1)^{[t_1, t_2, \ldots, t_m]}$, it can be shown exactly as in [6] that the sum is
equal to

$$A\binom{1, 2, \ldots, m}{r_1, r_2, \ldots, r_m}\, B\binom{r_1, r_2, \ldots, r_m}{1, 2, \ldots, m}.$$

All we need do to obtain the whole of sum (46) is to sum this
product over all $r_1 < r_2 < \ldots < r_m$, which gives us (45). Finally,
suppose $m > n$. In this case we can add $(m-n)$ columns of zeros
to array (42) and $(m-n)$ rows of zeros to array (43). If, after this,
we use the formula

$$c_{ik} = \sum_{s=1}^{m} a_{is}b_{sk} \quad (i, k = 1, 2, \ldots, m) \tag{48}$$

instead of (44), we get the same values of $c_{ik}$ as before, since the
additional terms on the right-hand side of (48) are zero. On the other hand,
arrays (42) and (43) have now become square, the corresponding
determinants being zero; and it follows from the theorem for multiplying
determinants that $|c_{ik}|_1^m$ is zero, so that the theorem is fully proved.
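
Formula (45) and the vanishing of the determinant for $m > n$ can both be observed on a small example. The Python sketch below is illustrative; the function name `sum_of_minor_products` and the two arrays are our own assumptions, and rows and columns are counted from 0.

```python
from itertools import combinations, permutations

def det(m):
    """Determinant from the definition: signed sum over all permutations."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = 1
        for row, col in enumerate(p):
            term *= m[row][col]
        total += (-1) ** inversions * term
    return total

def matmul(a, b):
    """Product (44): rows of a multiplied by columns of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def sum_of_minor_products(a, b):
    """Right-hand side of (45); a has m rows and n columns, b has n rows and m columns."""
    m, n = len(a), len(b)
    total = 0
    for r in combinations(range(n), m):  # r_1 < r_2 < ... < r_m; empty if m > n
        a_minor = det([[a[i][s] for s in r] for i in range(m)])
        b_minor = det([[b[s][k] for k in range(m)] for s in r])
        total += a_minor * b_minor
    return total

a = [[1, 2, 3], [4, 5, 6]]    # m = 2, n = 3
b = [[1, 0], [2, 1], [0, 3]]  # n = 3, m = 2
print(det(matmul(a, b)), sum_of_minor_products(a, b))
print(det(matmul(b, a)), sum_of_minor_products(b, a))  # roles exchanged: m > n
```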

Remark. If two rectangular matrices each have $m$ rows and $n$
columns, multiplication of rows by rows:

$$c_{ik} = \sum_{s=1}^{n} a_{is}b_{ks} \quad (i, k = 1, 2, \ldots, m)$$

gives us a determinant $|c_{ik}|_1^m$, the value of which is zero for $m > n$,
whereas for $m \le n$ it is given by

$$|c_{ik}|_1^m = \sum_{r_1 < r_2 < \ldots < r_m} A\binom{1, 2, \ldots, m}{r_1, r_2, \ldots, r_m}\, B\binom{1, 2, \ldots, m}{r_1, r_2, \ldots, r_m}.$$


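COROLLARY. Let us have two square matrices of order $n$ with elements
$a_{ik}$ and $b_{ik}$, whilst the numbers $c_{ik}$ are defined by expressions (44).
We express any minor $C\binom{p_1, p_2, \ldots, p_l}{q_1, q_2, \ldots, q_l}$ of the determinant $|c_{ik}|_1^n$ in
terms of the minors of determinants $|a_{ik}|_1^n$ and $|b_{ik}|_1^n$. It is easily
seen that the square array forming the minor $C\binom{p_1, p_2, \ldots, p_l}{q_1, q_2, \ldots, q_l}$ is the
product of the rectangular matrices:

$$\begin{Vmatrix} a_{p_11} & a_{p_12} & \ldots & a_{p_1n} \\ a_{p_21} & a_{p_22} & \ldots & a_{p_2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{p_l1} & a_{p_l2} & \ldots & a_{p_ln} \end{Vmatrix} \quad \text{and} \quad \begin{Vmatrix} b_{1q_1} & b_{1q_2} & \ldots & b_{1q_l} \\ b_{2q_1} & b_{2q_2} & \ldots & b_{2q_l} \\ \ldots & \ldots & \ldots & \ldots \\ b_{nq_1} & b_{nq_2} & \ldots & b_{nq_l} \end{Vmatrix}.$$

On applying the relevant theorem, we get the required expression:

$$C\binom{p_1, p_2, \ldots, p_l}{q_1, q_2, \ldots, q_l} = \sum_{r_1 < r_2 < \ldots < r_l} A\binom{p_1, p_2, \ldots, p_l}{r_1, r_2, \ldots, r_l}\, B\binom{r_1, r_2, \ldots, r_l}{q_1, q_2, \ldots, q_l}, \tag{49}$$

where the $r_k$ take their values from $1, 2, \ldots, n$. Let $R_A$, $R_B$, $R_C$ be
the ranks of matrices $\|a_{ik}\|_1^n$, $\|b_{ik}\|_1^n$, $\|c_{ik}\|_1^n$. If say $R_A < n$, and we
take any $l > R_A$ in (49), all the $A\binom{p_1, p_2, \ldots, p_l}{r_1, r_2, \ldots, r_l}$ will vanish by
definition of $R_A$, so that all the $C\binom{p_1, p_2, \ldots, p_l}{q_1, q_2, \ldots, q_l}$ likewise vanish. Hence
it follows that $R_C < l$, i.e. $R_C \le R_A$. If $\|a_{ik}\|_1^n$ is of rank $n$, it is
obvious that $R_C \le R_A$, since $R_C \le n$. Similarly, $R_C \le R_B$. We shall
show below that if the determinant $|b_{ik}|_1^n \ne 0$, we have $R_C = R_A$,
whilst if $|a_{ik}|_1^n \ne 0$, $R_C = R_B$.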


§ 2. The solution of systems of equations 


8. Cramer's theorem. Having described the nature and properties
of determinants, we now turn to their application to the solution of
systems of first degree equations. We start with the fundamental case
when the number of equations is the same as the number of unknowns.
We can write such a system of $n$ equations with $n$ unknowns as:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1, \\ a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= b_2, \\ \ldots \qquad & \\ a_{n1}x_1 + a_{n2}x_2 + \ldots + a_{nn}x_n &= b_n, \end{aligned} \tag{1}$$

the notation for the coefficients being similar to that used in [1] for
the case of three equations with three unknowns. We shall make the
assumption that the determinant of the system, i.e. the determinant
corresponding to the array of the coefficients $a_{ik}$, differs from zero:

$$\Delta = |a_{ik}|_1^n \ne 0. \tag{2}$$

We multiply both sides of the $i$th equation of system (1) by the
cofactor of the $i$th element of the $k$th column of this determinant,
i.e. both sides of the first equation are multiplied by $A_{1k}$, both sides of
the second equation by $A_{2k}$ and so on. We add the equations thus
obtained. The result is an equation, on the right-hand side of which
we have the sum

$$A_{1k}b_1 + A_{2k}b_2 + \ldots + A_{nk}b_n,$$

whilst the coefficient of the unknown $x_l$ on the left-hand side is given
by the sum

$$A_{1k}a_{1l} + A_{2k}a_{2l} + \ldots + A_{nk}a_{nl} \quad (l = 1, 2, \ldots, n).$$

This latter sum is zero for $l \ne k$ and equal to $\Delta$ for $l = k$, i.e. we
reduce to an equation of the form

$$\Delta \cdot x_k = A_{1k}b_1 + A_{2k}b_2 + \ldots + A_{nk}b_n.$$

On carrying out this procedure for each subscript $k$, we obtain a
system of new equations as a consequence of (1):

$$\Delta \cdot x_k = A_{1k}b_1 + A_{2k}b_2 + \ldots + A_{nk}b_n \quad (k = 1, 2, \ldots, n). \tag{3}$$

It may easily be shown that, conversely, system (1) can be obtained
as a consequence of (3). All we do is multiply both sides of the $k$th
equation (3) by $a_{lk}$, then sum for all $k$ from 1 to $n$. We again use
property V of determinants and clearly arrive at the equation

$$\Delta \cdot (a_{l1}x_1 + a_{l2}x_2 + \ldots + a_{ln}x_n) = \Delta \cdot b_l, \tag{4}$$

which, after cancelling the non-zero $\Delta$, gives us the $l$th equation of
system (1). This procedure is possible for any $l$. Systems (1) and (3)
are thus equivalent, and we can solve (3) instead of (1). System (3)
yields at once one and only one solution, given by

$$x_k = \frac{A_{1k}b_1 + A_{2k}b_2 + \ldots + A_{nk}b_n}{\Delta} \quad (k = 1, 2, \ldots, n). \tag{5}$$

We notice that, in view of our discussion in [3], the numerator of
the expression written consists of the determinant obtained from $\Delta$ by
replacing the elements of the $k$th column, i.e. the coefficients $a_{ik}$ of $x_k$,
by the constant terms $b_i$. This brings us to the following theorem.

CRAMER'S THEOREM. If the determinant $\Delta$ of system (1) differs from
zero, the system has a unique solution defined by expressions (5). These
expressions give each unknown as the quotient of two determinants, the
denominator being the determinant of the system and the numerator being
the determinant obtained from this by replacing the coefficients of the
unknown in question by the corresponding constant terms. Cramer's theorem
is inconvenient in the case of a large number of equations with many
unknowns; there are other methods that are approximate but more
practical, though we shall not stop to consider them.
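
Expressions (5) can be tried out directly on a small system. The following is a minimal sketch in Python, assuming exact rational arithmetic; the function names `det` and `cramer` and the numerical system are illustrative choices of ours, not part of the text. The determinant is expanded along the first row, which is adequate only for small $n$.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row (suitable for small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: delete the first row and the j-th column.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def cramer(a, b):
    """Solve the square system a x = b by expressions (5); needs det(a) != 0."""
    a = [[Fraction(v) for v in row] for row in a]
    b = [Fraction(v) for v in b]
    d = det(a)
    if d == 0:
        raise ValueError("determinant is zero; Cramer's theorem does not apply")
    solution = []
    for k in range(len(a)):
        # Replace the k-th column of the determinant by the constant terms.
        replaced = [row[:k] + [b[i]] + row[k + 1:] for i, row in enumerate(a)]
        solution.append(det(replaced) / d)
    return solution


# Illustrative system: 2x + y - z = 1, x + 3y + 2z = 13, x + z = 4.
x = cramer([[2, 1, -1], [1, 3, 2], [1, 0, 1]], [1, 13, 4])
```

For this system the determinant is 10, so the theorem applies and `x` holds the unique solution as exact fractions.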


9. The general case of systems of equations. We take the general 
case of m equations with n unknowns: 
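$$
\begin{aligned}
X_1 &\equiv a_{11}x_1 + a_{12}x_2 + \ldots + a_{1k}x_k + a_{1,k+1}x_{k+1} + \ldots + a_{1n}x_n = b_1\\
X_2 &\equiv a_{21}x_1 + a_{22}x_2 + \ldots + a_{2k}x_k + a_{2,k+1}x_{k+1} + \ldots + a_{2n}x_n = b_2\\
&\;\;\ldots\\
X_k &\equiv a_{k1}x_1 + a_{k2}x_2 + \ldots + a_{kk}x_k + a_{k,k+1}x_{k+1} + \ldots + a_{kn}x_n = b_k\\
X_{k+1} &\equiv a_{k+1,1}x_1 + a_{k+1,2}x_2 + \ldots + a_{k+1,k}x_k + a_{k+1,k+1}x_{k+1} + \ldots + a_{k+1,n}x_n = b_{k+1}\\
&\;\;\ldots\\
X_m &\equiv a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mk}x_k + a_{m,k+1}x_{k+1} + \ldots + a_{mn}x_n = b_m.
\end{aligned}
\tag{6}
$$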

The complete left-hand side of the $s$th equation has been written $X_s$
for the sake of brevity in later working. The coefficients $a_{ij}$ of the
system form a rectangular matrix, with rank say $k$. By rearranging
the rows and columns, i.e. re-numbering the equations and variables,
we can bring a non-zero determinant of the matrix of order $k$ to the
top left-hand corner. We call this the leading determinant of the system.
It will have the form:

$$
\Delta = \begin{vmatrix}
a_{11} & a_{12} & \ldots & a_{1k}\\
a_{21} & a_{22} & \ldots & a_{2k}\\
\ldots & \ldots & \ldots & \ldots\\
a_{k1} & a_{k2} & \ldots & a_{kk}
\end{vmatrix}.
\tag{7}
$$




We form $(m - k)$ determinants of order $(k + 1)$ which are called
characteristic determinants of the system and which are obtained from the
leading determinant by adding a row of coefficients of an equation with
a number greater than $k$ and a column of constant terms. More precisely, the
characteristic determinants are defined by the following expression:

$$
\Delta_{k+s} = \begin{vmatrix}
a_{11} & a_{12} & \ldots & a_{1k} & b_1\\
a_{21} & a_{22} & \ldots & a_{2k} & b_2\\
\ldots & \ldots & \ldots & \ldots & \ldots\\
a_{k1} & a_{k2} & \ldots & a_{kk} & b_k\\
a_{k+s,1} & a_{k+s,2} & \ldots & a_{k+s,k} & b_{k+s}
\end{vmatrix}
\qquad (k + s = k + 1,\ k + 2,\ \ldots,\ m).
\tag{8}
$$


If $k = m$, i.e. the rank is equal to the number of equations, no
characteristic determinants exist. We consider the further determinants
obtained by replacing the last column of constant terms in a
characteristic determinant by the left-hand sides of the equations:

$$
\begin{vmatrix}
a_{11} & a_{12} & \ldots & a_{1k} & X_1\\
a_{21} & a_{22} & \ldots & a_{2k} & X_2\\
\ldots & \ldots & \ldots & \ldots & \ldots\\
a_{k1} & a_{k2} & \ldots & a_{kk} & X_k\\
a_{k+s,1} & a_{k+s,2} & \ldots & a_{k+s,k} & X_{k+s}
\end{vmatrix}.
\tag{9}
$$


These determinants contain the $x_j$ along with the given coefficients $a_{ij}$.
But it is easily shown that determinants (9) are identically zero. Since

$$X_i = a_{i1}x_1 + a_{i2}x_2 + \ldots + a_{in}x_n,$$

the last column of any one of (9) is the sum of $n$ terms, so that, by
property IV of [3], the determinant can be written as the sum of
expressions of the form:

$$
\begin{vmatrix}
a_{11} & a_{12} & \ldots & a_{1k} & a_{1j}\\
a_{21} & a_{22} & \ldots & a_{2k} & a_{2j}\\
\ldots & \ldots & \ldots & \ldots & \ldots\\
a_{k1} & a_{k2} & \ldots & a_{kk} & a_{kj}\\
a_{k+s,1} & a_{k+s,2} & \ldots & a_{k+s,k} & a_{k+s,j}
\end{vmatrix} \cdot x_j.
$$

The determinant multiplying $x_j$ is soon seen to be zero: if $j \le k$,
the last column is the same as one of the previous ones; whereas if
$j > k$, the determinant is one of order $(k + 1)$ appearing in the matrix of (6)




and vanishes because the rank of (6) is k by hypothesis. On subtracting 
the identically zero determinants (9) from the characteristic determi- 
nants, we can write the latter as follows: 


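$$
\Delta_{k+s} = \begin{vmatrix}
a_{11} & a_{12} & \ldots & a_{1k} & b_1 - X_1\\
a_{21} & a_{22} & \ldots & a_{2k} & b_2 - X_2\\
\ldots & \ldots & \ldots & \ldots & \ldots\\
a_{k1} & a_{k2} & \ldots & a_{kk} & b_k - X_k\\
a_{k+s,1} & a_{k+s,2} & \ldots & a_{k+s,k} & b_{k+s} - X_{k+s}
\end{vmatrix}
\qquad (k + s = k + 1,\ k + 2,\ \ldots,\ m),
\tag{10}
$$

the dependence on the $x_j$ being merely apparent in this form. We now
suppose that system (6) has a solution:

$$x_1 = x_1^{(0)},\quad x_2 = x_2^{(0)},\quad \ldots,\quad x_n = x_n^{(0)}.$$

On substituting $x_j = x_j^{(0)}$ in the last column of (10), we get a column
of zeros, i.e. all the characteristic determinants must be equal to
zero.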

THEOREM I. The necessary condition for system (6) to have at least one
solution is for all the characteristic determinants (8) to vanish.

We now prove the sufficiency of the condition and give the method
for finding all the solutions of the system. Thus, let all the
characteristic determinants vanish. We take these in form (10) and expand
by the last column. The cofactor of the element $(b_{k+s} - X_{k+s})$ is easily
seen to be the leading determinant $\Delta$ which is not zero, and we can
write the vanishing condition for the characteristic determinants as

$$
a_1^{(s)}(b_1 - X_1) + a_2^{(s)}(b_2 - X_2) + \ldots + a_k^{(s)}(b_k - X_k)
+ \Delta\,(b_{k+s} - X_{k+s}) = 0 \qquad (k + s = k + 1,\ k + 2,\ \ldots,\ m),
\tag{11}
$$

where the $a_j^{(s)}$ are numerical coefficients of no interest to us.


Suppose now that we have a solution of the first $k$ equations and
that this is substituted for the $x_j$ in identity (11). All the differences

$$b_1 - X_1,\quad b_2 - X_2,\quad \ldots,\quad b_k - X_k$$

now vanish, and we are left with

$$\Delta \cdot (b_{k+s} - X_{k+s}) = 0$$

or, since $\Delta \ne 0$:

$$b_{k+s} - X_{k+s} = 0 \qquad (k + s = k + 1,\ k + 2,\ \ldots,\ m),$$




i.e. if all the characteristic determinants vanish, any solution of the 
first k equations must also satisfy all the remaining equations. Thus 
all we have to do is solve the first $k$ equations.

We take all the unknowns with subscripts greater than $k$ to the
right-hand sides in these equations, so that they become

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1k}x_k &= b_1 - a_{1,k+1}x_{k+1} - \ldots - a_{1n}x_n\\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2k}x_k &= b_2 - a_{2,k+1}x_{k+1} - \ldots - a_{2n}x_n\\
&\;\;\ldots\\
a_{k1}x_1 + a_{k2}x_2 + \ldots + a_{kk}x_k &= b_k - a_{k,k+1}x_{k+1} - \ldots - a_{kn}x_n.
\end{aligned}
\tag{12}
$$

We consider the above as a system for determining $x_1, x_2, \ldots, x_k$.


Its determinant $\Delta$ is non-zero, so that we can use Cramer's rule to
obtain a determinate, unique solution. We only need to notice that
the constant terms here contain $x_{k+1}, \ldots, x_n$, to which we can assign
arbitrary values. Cramer's rule gives us the solution of (12) at once as:

$$x_j = \alpha_j + \beta_{k+1}^{(j)} x_{k+1} + \ldots + \beta_n^{(j)} x_n \qquad (j = 1, 2, \ldots, k), \tag{13}$$

where the $\alpha_j$ and $\beta_q^{(j)}$ are numerical coefficients and $x_{k+1}, \ldots, x_n$ remain
arbitrary. It follows from the above that these expressions in fact give
the most general solution of system (6) with the hypothesis made
regarding the vanishing of all the characteristic determinants.

THEOREM II. If all the characteristic determinants of a system vanish,
only the equations containing the leading determinant need be solved,
with respect to the unknowns whose coefficients make up the leading
determinant. This solution can be found by Cramer's rule and expresses
$k$ unknowns, where $k$ is the rank of the matrix of coefficients, as linear
functions (13) of the remaining $(n - k)$ unknowns, the values of which
remain entirely arbitrary. All the solutions of system (6) are obtained
in this way.

On comparing Theorems I and II, we arrive at the conclusion:

THEOREM III. The necessary and sufficient condition for the existence
of a solution of system (6) is the vanishing of all the characteristic
determinants of the system.
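
A very small illustration may help to fix the idea; the system below is our own and is not taken from the text. Consider

$$x_1 + x_2 = 1, \qquad 2x_1 + 2x_2 = 3.$$

The matrix of coefficients has rank $k = 1$, and we may take the leading determinant to be $\Delta = a_{11} = 1$. There is $m - k = 1$ characteristic determinant:

$$\Delta_2 = \begin{vmatrix} 1 & 1\\ 2 & 3 \end{vmatrix} = 1 \ne 0,$$

so that the system has no solution. If the constant term 3 is replaced by 2, we get $\Delta_2 = 0$, and the second equation becomes a consequence of the first.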

We remark that, if $k = n$, i.e. the rank is equal to the number of
unknowns, there are no $x_j$ whatever on the right-hand sides of (13),
and all the unknowns from $x_1$ to $x_n$ are fully determined.

THEOREM IV. The necessary and sufficient condition for the system
to have a unique solution is that all the characteristic determinants
vanish and the rank of the matrix of coefficients is equal to the number of
unknowns.




It may be remarked that the whole of the above discussion is
clearly valid for the case when the number of equations is equal to
the number of unknowns, i.e. $m = n$.


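Example. We take the system of four equations with three unknowns:

$$
\begin{aligned}
x - 3y - 2z &= -1\\
2x + y - 4z &= 3\\
x + 4y - 2z &= 4\\
5x + 6y - 10z &= 10.
\end{aligned}
$$

We write down the matrix of coefficients:

$$
\begin{Vmatrix}
1 & -3 & -2\\
2 & 1 & -4\\
1 & 4 & -2\\
5 & 6 & -10
\end{Vmatrix}
$$

We may easily verify that all the third order determinants in this matrix are
zero, whilst the second order determinant at the top left corner differs from zero.
We can thus take the latter as the leading determinant, whilst the rank of the
system is two. We form the characteristic determinants, of which there are
two in the present case:

$$
\Delta_3 = \begin{vmatrix} 1 & -3 & -1\\ 2 & 1 & 3\\ 1 & 4 & 4 \end{vmatrix},
\qquad
\Delta_4 = \begin{vmatrix} 1 & -3 & -1\\ 2 & 1 & 3\\ 5 & 6 & 10 \end{vmatrix}.
$$

Both these are zero and the given system is therefore consistent. Hence
we only need to solve the first two equations with respect to $x$ and $y$, $z$ being
taken to the right-hand side:

$$
\begin{aligned}
x - 3y &= 2z - 1\\
2x + y &= 4z + 3.
\end{aligned}
$$

The solution is obtained in the form:

$$
x = \frac{\begin{vmatrix} 2z - 1 & -3\\ 4z + 3 & 1 \end{vmatrix}}{\begin{vmatrix} 1 & -3\\ 2 & 1 \end{vmatrix}} = 2z + \frac{8}{7},
\qquad
y = \frac{\begin{vmatrix} 1 & 2z - 1\\ 2 & 4z + 3 \end{vmatrix}}{\begin{vmatrix} 1 & -3\\ 2 & 1 \end{vmatrix}} = \frac{5}{7},
$$

$z$ being arbitrary.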
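
The arithmetic of this example can be checked mechanically. The sketch below, in Python with exact fractions, forms the leading and characteristic determinants and solves the first two equations by Cramer's rule for a given $z$; the names `det`, `characteristic` and `solve_first_k` are hypothetical helpers introduced only for this check.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row (suitable for small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


A = [[1, -3, -2], [2, 1, -4], [1, 4, -2], [5, 6, -10]]  # coefficients
b = [-1, 3, 4, 10]                                      # constant terms
k = 2                                                   # rank of the matrix

# Leading determinant: order k, top left-hand corner.
leading = det([row[:k] for row in A[:k]])


def characteristic(r):
    """Characteristic determinant formed with equation r (0-based, r >= k)."""
    rows = [A[i][:k] + [b[i]] for i in range(k)] + [A[r][:k] + [b[r]]]
    return det(rows)


def solve_first_k(z):
    """Cramer's rule for x, y from the first two equations, z taken to the right."""
    rhs = [b[i] - A[i][2] * z for i in range(k)]
    x = det([[rhs[0], A[0][1]], [rhs[1], A[1][1]]]) / Fraction(leading)
    y = det([[A[0][0], rhs[0]], [A[1][0], rhs[1]]]) / Fraction(leading)
    return x, y
```

Calling `solve_first_k` with any rational `z` and substituting the result into all four equations confirms that the last two equations are satisfied automatically, as the theory requires.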


10. Homogeneous systems. A system is said to be homogeneous if
all its constant terms $b_i$ are zero. If the system has characteristic
determinants, the last columns of these are made up of zeros and they




consequently vanish. Obviously, every homogeneous system has the
solution

$$x_1 = x_2 = \ldots = x_n = 0,$$

which we shall speak of in future as the trivial solution. The
fundamental problem for a homogeneous system is whether or not it has a
non-trivial solution, and if it has, then what is the total set of such
solutions? We start with the case when the number of equations is equal
to the number of unknowns. The system becomes here:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= 0\\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= 0\\
&\;\;\ldots\\
a_{n1}x_1 + a_{n2}x_2 + \ldots + a_{nn}x_n &= 0.
\end{aligned}
\tag{14}
$$

If the determinant of the system differs from zero, there is a unique
solution by Cramer's theorem, and this is, in fact, the trivial solution.
If the determinant vanishes, the rank $k$ of the matrix of coefficients
will be less than the number $n$ of unknowns, $(n - k)$ unknowns will
thus have completely arbitrary values, and we shall have an infinite
set of non-trivial solutions. Hence we arrive at the following basic
theorem.

THEOREM I. The necessary and sufficient condition for system (14)
to have a non-trivial solution is that its determinant vanish.

A parallel may be drawn between the results obtained for the non- 
homogeneous system (1) and homogeneous system (14). If the deter- 
minant of the system differs from zero, the non-homogeneous system 
has a unique solution whilst the homogeneous system only has the 
trivial solution. Whereas if the determinant vanishes, homogeneous 
system (14) has non-trivial solutions, yet no solution of (1) in general 
exists, since the existence of a solution of (1) requires a choice of 
constant terms such that all the characteristic determinants vanish. 
These parallels are of great significance below. In problems of physics, 
homogeneous systems are encountered when considering free vibrations 
and non-homogeneous systems with forced vibrations; the vanishing 
of the determinant for the homogeneous system characterizes the 
presence of proper vibrations, whereas it characterizes resonance in 
the case of the non-homogeneous system. 

We now turn to a detailed discussion of the solutions of system (14) 
when its basic determinant vanishes. Let k be the rank of the matrix 
of its coefficients, where evidently, k <n. In accordance with the 




theorem proved in the previous section, we have to take the $k$ equations
containing the leading determinant and solve these with respect to
$k$ unknowns. We can assume without loss of generality that these
unknowns are $x_1, \ldots, x_k$. The solutions are obtained in the form:

$$x_j = \beta_{k+1}^{(j)} x_{k+1} + \ldots + \beta_n^{(j)} x_n \qquad (j = 1, 2, \ldots, k), \tag{15}$$

where the $\beta_q^{(j)}$ are definite numerical coefficients and $x_{k+1}, \ldots, x_n$
can take arbitrary values.

A general property of solutions of system (14) should be noticed;
this is a direct consequence of the linearity and homogeneity and
may be designated the principle of superposition of solutions. If we
have solutions of the system

$$x_s = x_s^{(1)};\quad x_s = x_s^{(2)};\quad \ldots;\quad x_s = x_s^{(l)} \qquad (s = 1, 2, \ldots, n), \tag{16}$$

further solutions are obtained by multiplying these by arbitrary
constants and adding:

$$x_s = C_1 x_s^{(1)} + C_2 x_s^{(2)} + \ldots + C_l x_s^{(l)} \qquad (s = 1, 2, \ldots, n).$$

We use the same approach as in the case of linear differential
equations [II, 26] and say that solutions (16) are linearly independent
if no constants $C_j$ exist, not all zero, such that we have the equality
for every $s$:

$$\sum_{j=1}^{l} C_j x_s^{(j)} = 0.$$


We can readily form $(n - k)$ linearly independent solutions of the
system such that multiplication by arbitrary constants followed by
addition gives us all the solutions. We return in fact to expressions
(15) for the general solution and form solutions from these in the
following manner: we put $x_{k+1} = 1$ and all the remaining $x_{k+s}$ equal
to zero in the first solution; in the second solution we put $x_{k+2} = 1$
and all the remaining $x_{k+s}$ equal to zero, and so on; in the last, $(n - k)$th
solution, we put $x_n = 1$ and all the remaining $x_{k+s}$ equal to zero.
The solutions obtained are easily seen to be linearly independent,
since each contains one unknown equal to unity which is equal to
zero in the remaining solutions. We denote these solutions as follows:

$$x_s = x_s^{(1)};\quad x_s = x_s^{(2)};\quad \ldots;\quad x_s = x_s^{(n-k)} \qquad (s = 1, 2, \ldots, n).$$

We now take some given solution of system (14), obtained from
expressions (15) with the particular values:

$$x_{k+1} = \gamma_{k+1},\quad x_{k+2} = \gamma_{k+2},\quad \ldots,\quad x_n = \gamma_n.$$




It is clear at once that this solution is a linear combination of the
solutions formed above, in fact:

$$x_s = \gamma_{k+1} x_s^{(1)} + \gamma_{k+2} x_s^{(2)} + \ldots + \gamma_n x_s^{(n-k)} \qquad (s = 1, 2, \ldots, n).$$

The total number of linearly independent solutions of homogeneous
system (14) is equal to $(n - k)$ for any choice of linearly independent
solutions. This point will be raised again later.
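
The construction of the $(n - k)$ solutions can be sketched in Python. One caveat: instead of renumbering the unknowns and applying Cramer's rule, the sketch below reduces the matrix by elimination with exact fractions, which is a different technique that yields the same kind of expressions as (15); the unknowns left without a pivot play the part of $x_{k+1}, \ldots, x_n$. The function name `fundamental_solutions` is our own.

```python
from fractions import Fraction


def fundamental_solutions(a):
    """Return n - k linearly independent solutions of the homogeneous system a x = 0."""
    rows = [[Fraction(v) for v in row] for row in a]
    n = len(rows[0])
    pivots = []  # pivots[i] is the column of the leading 1 in reduced row i
    r = 0        # next row that will receive a pivot
    for c in range(n):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: the unknown stays arbitrary
        rows[r], rows[p] = rows[p], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [v / pivot_value for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        x = [Fraction(0)] * n
        x[f] = Fraction(1)        # one arbitrary unknown equal to unity, the rest zero
        for row, c in zip(rows, pivots):
            x[c] = -row[f]        # the determined unknown from its reduced equation
        basis.append(x)
    return basis


# Homogeneous system with rank 2 and three unknowns: one solution is expected.
basis = fundamental_solutions([[1, -3, -2], [2, 1, -4], [1, 4, -2]])
```

When the rank equals the number of unknowns the function returns an empty list, in agreement with the fact that only the trivial solution then exists.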

We return to the general case of m homogeneous equations with 
n unknowns. If m < n, the rank k, which cannot exceed m, is likewise 
less than n, and (n — k) unknowns remain arbitrary, i.e. if the number 
of homogeneous equations is less than the number of unknowns, the 
system has non-trivial solutions. 

In general, $k \le n$, and the system has only the trivial solution for $k = n$.


11. Linear forms. The study of systems of linear forms is closely
related to the problem of solving systems of first degree equations.
A linear form of the variables $x_1, x_2, \ldots, x_n$ means a linear homogeneous
function of these variables. Let us have $m$ such linear forms:

$$y_s = a_{s1}x_1 + a_{s2}x_2 + \ldots + a_{sn}x_n \qquad (s = 1, 2, \ldots, m). \tag{17}$$

These forms are said to be linearly dependent if there exist constants
$\alpha_1, \alpha_2, \ldots, \alpha_m$, not all zero, such that we have the identity with respect
to variables $x_1, x_2, \ldots, x_n$:

$$\alpha_1 y_1 + \alpha_2 y_2 + \ldots + \alpha_m y_m = 0.$$


If no such constants exist, forms (17) are said to be linearly independent. 
The coefficients of all the $x_j$ must be equated to zero in the identity
written. Hence the identity is equivalent to the following system of $n$
equations:

$$
\begin{aligned}
\alpha_1 a_{11} + \alpha_2 a_{21} + \ldots + \alpha_m a_{m1} &= 0,\\
\alpha_1 a_{12} + \alpha_2 a_{22} + \ldots + \alpha_m a_{m2} &= 0,\\
&\;\;\ldots\\
\alpha_1 a_{1n} + \alpha_2 a_{2n} + \ldots + \alpha_m a_{mn} &= 0.
\end{aligned}
$$

The forms $y_s$ are linearly independent when and only when this system
of homogeneous equations in $\alpha_1, \alpha_2, \ldots, \alpha_m$ has only the trivial solution.

The results obtained above lead to a number of conclusions regard- 
ing the linear dependence of forms. If m>n, the homogeneous 
system written certainly has non-trivial solutions and the forms are 




linearly dependent. The necessary and sufficient condition for the
forms to be independent is that the rank $k$ of the matrix of coefficients
$a_{pq}$ is equal to the number of forms $m$. If $m = n$, i.e. the number of
forms is equal to the number of variables, the necessary and sufficient
condition for linear independence is the non-vanishing of the determinant
of the square ($m = n$) matrix of the $a_{pq}$. We speak in this case of the existence
of a complete system of linearly independent forms. If $m < n$ and forms
(17) are linearly independent (i.e. $k = m$), the system of equations
(17) is soluble for any values of $y_s$ with respect to the variables $x_j$
whose coefficients form a non-zero determinant of order $k$, i.e.
linearly independent forms can take any set of values $y_s$. If $k = m = n$,
all the variables $x_j$ are defined for given $y_s$.

We now take $k < m$. By suitable numbering of the forms $y_s$ and
variables $x_j$, we can arrange for a non-zero determinant of order $k$
to stand at the top left corner of the matrix of $a_{pq}$. With this, the
first $k$ forms, $y_1, y_2, \ldots, y_k$ are linearly independent, whilst each of
the remaining forms $y_{k+s}$ can be expressed linearly in terms of the
first $k$ forms. This follows because the rank, equal to $k$, of the matrix
of coefficients of the first $k$ forms is the same as the number of forms,
whence their linear independence. If we take $(k + 1)$ forms $y_1, y_2, \ldots,
y_k, y_{k+s}$, the rank of the matrix of their coefficients is still $k$ and is less
than the number of forms, i.e. the forms are linearly dependent,
so that there exist constants $\beta_j$ such that

$$\beta_1 y_1 + \ldots + \beta_k y_k + \beta_{k+s} y_{k+s} = 0.$$

The coefficient $\beta_{k+s}$ in this relationship must differ from zero, since
otherwise the first $k$ forms would be linearly dependent. Hence we
have a linear expression for $y_{k+s}$ in terms of the first $k$ forms:

$$y_{k+s} = -\frac{\beta_1}{\beta_{k+s}}\, y_1 - \frac{\beta_2}{\beta_{k+s}}\, y_2 - \ldots - \frac{\beta_k}{\beta_{k+s}}\, y_k.$$


The number k is called the rank of the system of forms (17). This number 
is equal to the rank of the matrix of coefficients on the one hand, and on 
the other, to the greatest number of linearly independent forms of system (17). 
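
The rank of a system of forms is easy to compute for a numerical example. The sketch below is in Python with exact fractions and finds the rank by elimination rather than by searching for non-zero determinants, which is an assumption of convenience on our part; the names `rank` and `forms_independent`, and the three forms used, are illustrative only.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rectangular matrix, found by elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def forms_independent(coefficients):
    """Forms are independent exactly when the rank equals the number of forms."""
    return rank(coefficients) == len(coefficients)


# y1 = x1 + 2 x2 + 3 x3, y2 = 4 x1 + 5 x2 + 6 x3, y3 = 7 x1 + 8 x2 + 9 x3.
forms = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k = rank(forms)
```

Here $y_3 = 2y_2 - y_1$, so the three forms are linearly dependent while the first two are independent, and the rank of the system is two.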

Suppose we have $k$ linearly independent forms $y_1, y_2, \ldots, y_k$,
where $k < n$. We can assume that the $k$th order determinant at the
top left corner of the matrix of $a_{pq}$ differs from zero. This system of $k$
forms may easily be extended to become a complete system of $n$
linearly independent forms. All we need do is take say

$$y_{k+1} = x_{k+1},\quad \ldots,\quad y_n = x_n.$$




The determinant of the $n$ forms obtained will be:

$$
\begin{vmatrix}
a_{11} & a_{12} & \ldots & a_{1k} & a_{1,k+1} & \ldots & a_{1n}\\
a_{21} & a_{22} & \ldots & a_{2k} & a_{2,k+1} & \ldots & a_{2n}\\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots\\
a_{k1} & a_{k2} & \ldots & a_{kk} & a_{k,k+1} & \ldots & a_{kn}\\
0 & 0 & \ldots & 0 & 1 & \ldots & 0\\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots\\
0 & 0 & \ldots & 0 & 0 & \ldots & 1
\end{vmatrix}
$$

On expanding this determinant by the last row, then the next to
the last row, and so on, we see that its magnitude is equal to the $k$th
order determinant at the top left corner, i.e. is non-zero. Thus the
forms $y_1, y_2, \ldots, y_n$ are in fact linearly independent. It follows that
every system of linearly independent forms can be extended to become a
complete system of linearly independent forms.


12. n-dimensional vector space. The results obtained above are open
to a geometrical interpretation which will be useful later. We introduce
for this purpose the concept of a vector in n-dimensional space,
a vector being defined as a set of $n$ (complex) numbers appearing
in a definite order. Any such vector $\mathbf{x}$ is characterized by a sequence
of $n$ complex numbers, known as the components of the vector:
$\mathbf{x}(x_1, x_2, \ldots, x_n)$. The aggregate of all these vectors forms an n-dimensional
vector space $R_n$.

Two vectors are taken to be equal when and only when all their
components are the same, i.e. if $\mathbf{u}(u_1, u_2, \ldots, u_n)$ and $\mathbf{v}(v_1, v_2, \ldots, v_n)$ are
two vectors, the vector equation $\mathbf{u} = \mathbf{v}$ is equivalent to the following
scalar equations: $u_1 = v_1$; $u_2 = v_2$; $\ldots$; $u_n = v_n$. We next define
multiplication of a vector by a number and addition of vectors.
Multiplication of a vector by a number amounts by definition to
multiplication of all the components of the vector by the number,
i.e. if vector $\mathbf{x}$ has components $(x_1, x_2, \ldots, x_n)$, vector $k\mathbf{x}$ has
components $(kx_1, kx_2, \ldots, kx_n)$. Addition of vectors amounts to addition
of their components, i.e. if we have vectors $\mathbf{x}(x_1, x_2, \ldots, x_n)$ and
$\mathbf{y}(y_1, y_2, \ldots, y_n)$, their sum $\mathbf{x} + \mathbf{y}$ has by definition components
$(x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$. The null vector is defined as the
vector $(0, 0, \ldots, 0)$, all the components of which are zero. We write




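the null vector as $\theta$. We obviously have $\theta = 0\mathbf{x}$, where $\mathbf{x}$ is any vector,
and $\mathbf{x} + \theta = \mathbf{x}$. Subtraction of vectors is defined thus: the vector
$\mathbf{x} - \mathbf{y}$ has components $(x_1 - y_1, x_2 - y_2, \ldots, x_n - y_n)$. Obviously,
$\mathbf{x} - \mathbf{x} = \theta$ and $\mathbf{x} - \mathbf{y} = \mathbf{x} + (-1)\mathbf{y}$, i.e. subtraction of vector $\mathbf{y}$ is
equivalent to addition of $\mathbf{y}$ multiplied by $(-1)$. We shall often have
to write vector equations below. Any such equation is equivalent
to $n$ scalar equations, expressing the fact that corresponding
components of each side are equal. Though we shall not use the symbol $\theta$
below for the null vector, it must be borne in mind that a zero
appearing on one side of an equation is to be read as the null vector. The
ordinary properties of addition and multiplication follow at once
from the definitions given above:

$$
\begin{gathered}
\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}; \qquad \mathbf{x} + (\mathbf{y} + \mathbf{z}) = (\mathbf{x} + \mathbf{y}) + \mathbf{z};\\
(k_1 + k_2)\mathbf{x} = k_1\mathbf{x} + k_2\mathbf{x}; \qquad k(\mathbf{x} + \mathbf{y}) = k\mathbf{x} + k\mathbf{y}; \qquad k_1(k_2\mathbf{x}) = (k_1 k_2)\mathbf{x}.
\end{gathered}
$$

We can thus transpose or group together the terms in a vector sum
with any number of terms. From the equation $\mathbf{x} + \mathbf{y} = \mathbf{z}$, it follows
that $\mathbf{x} = \mathbf{z} - \mathbf{y}$ and $\mathbf{y} = \mathbf{z} - \mathbf{x}$, and conversely, from $\mathbf{x} - \mathbf{y} = \mathbf{z}$ it
follows that $\mathbf{x} = \mathbf{y} + \mathbf{z}$.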


We now introduce the concepts of linear dependence and
independence for vectors. The vectors

$$\mathbf{x}^{(1)},\ \mathbf{x}^{(2)},\ \ldots,\ \mathbf{x}^{(l)} \tag{18}$$

will be said to be linearly dependent if there exist constants $C_1, \ldots, C_l$,
not all zero, such that

$$C_1\mathbf{x}^{(1)} + C_2\mathbf{x}^{(2)} + \ldots + C_l\mathbf{x}^{(l)} = 0. \tag{19}$$

If no such constants exist, vectors (18) are said to be linearly
independent. We write the components of vectors $\mathbf{x}^{(j)}$ as $(x_1^{(j)}, x_2^{(j)}, \ldots, x_n^{(j)})$.


Condition (19) is clearly equivalent to the system of $n$ equations with
unknowns $C_1, C_2, \ldots, C_l$:

$$
\begin{aligned}
x_1^{(1)}C_1 + x_1^{(2)}C_2 + \ldots + x_1^{(l)}C_l &= 0\\
x_2^{(1)}C_1 + x_2^{(2)}C_2 + \ldots + x_2^{(l)}C_l &= 0\\
&\;\;\ldots\\
x_n^{(1)}C_1 + x_n^{(2)}C_2 + \ldots + x_n^{(l)}C_l &= 0.
\end{aligned}
\tag{20}
$$

By using the results obtained above for homogeneous systems,
we can easily draw a number of conclusions and interpret them
geometrically. Let us first take $l > n$, i.e. the number of vectors is
greater than the number of spatial dimensions. With this, the number
of equations in the homogeneous system (20) is less than the number of
unknowns, and, as we know, the system certainly has non-zero
solutions for the $C_j$, i.e. the vectors are certainly linearly dependent.
In other words, the number of linearly independent vectors is at most 
equal to the number of dimensions. We now take the case $l = n$. Here
system (20) contains as many equations as unknowns and has
non-zero solutions when and only when its determinant vanishes, i.e. if
we have $n$ vectors in n-dimensional space, and form a determinant
from the $n^2$ components, locating say the components of a given
vector in a given column, with the rows having the same numbering 
as the components, the necessary and sufficient condition for linear 
independence of the vectors is the non-vanishing of this determinant. 
The magnitude of the determinant is analogous to the volume of a 
parallelepiped in real three-dimensional space. 

We can consider the elements $(b_{1k}, b_{2k}, \ldots, b_{nk})$ of each column in
any determinant $|b_{ik}|$ of order $n$ as the components of a vector $\mathbf{b}^{(k)}$,
the magnitude of the determinant being here a function of the $n$
vectors $\mathbf{b}^{(1)}, \ldots, \mathbf{b}^{(n)}$. The vanishing of the determinant is equivalent
to the fact that the vectors are linearly dependent.

The magnitude of the determinant, considered as a function of
vectors $\mathbf{b}^{(k)}$, is written

$$|b_{ik}| = \Delta(\mathbf{b}^{(1)}, \mathbf{b}^{(2)}, \ldots, \mathbf{b}^{(n)}).$$

On recalling that the magnitude of a determinant changes sign
on interchange of two columns, we can say that the function $\Delta$
merely changes sign on interchange of two of its arguments. Such a
function is usually said to be anti-symmetric. It may readily be seen
that for instance the Vandermonde determinant $D_n$, considered
in [5], is likewise an anti-symmetric function of its arguments
$x_1, x_2, \ldots, x_n$.
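
Both statements, the determinant test for linear dependence and the anti-symmetry, can be observed on three vectors of real three-dimensional space. The Python sketch below is only an illustration; the names `det` and `delta` and the sample vectors are our own choices.

```python
def det(m):
    """Determinant by expansion along the first row (suitable for small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def delta(*vectors):
    """Determinant whose k-th column holds the components of the k-th vector."""
    n = len(vectors)
    return det([[vectors[k][i] for k in range(n)] for i in range(n)])


b1, b2, b3 = (1, 0, 2), (0, 1, 1), (1, 1, 0)

value = delta(b1, b2, b3)       # non-zero: the vectors are linearly independent
swapped = delta(b2, b1, b3)     # interchange of two arguments changes the sign
dependent = delta(b1, b2, (1, 1, 3))  # third vector is b1 + b2: determinant vanishes
```

The third call replaces the last argument by the sum of the first two, which makes the vectors linearly dependent and the determinant zero.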

We return to system (20) and the question of the linear independence
of vectors $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(l)}$, on the assumption that $l < n$. Let $k$ be the
rank of the matrix formed by the components $x_i^{(j)}$. If $k = l$, as we saw,
the system has only the trivial solution, i.e. the vectors are linearly
independent. Whereas if $k < l$, the system certainly has a non-trivial
solution, i.e. the necessary and sufficient condition for vectors to be
linearly independent is for their number to be equal to the rank of the
matrix formed by their components. We now assume $k < l$, i.e. the
vectors are linearly dependent. We distinguish among these the $k$
vectors whose components contain a $k$th order non-zero determinant




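(there may be more than one way of doing this). By what was proved
above, these $k$ vectors are linearly independent. Each of the remaining
vectors is easily seen to be linearly expressible in terms of the chosen
vectors. In fact, let $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(k)}$ be the linearly independent vectors.
On associating with these any vector $\mathbf{x}^{(k+s)}$, we get $(k + 1)$ vectors,
which will be linearly dependent, since the rank $k$ of the matrix of
their components is less than the number $l = k + 1$. Hence constants
$C_j$ $(j = 1, 2, \ldots, k, k + s)$ will exist, not all zero, such that

$$C_1\mathbf{x}^{(1)} + C_2\mathbf{x}^{(2)} + \ldots + C_k\mathbf{x}^{(k)} + C_{k+s}\mathbf{x}^{(k+s)} = 0.$$

We certainly have $C_{k+s} \ne 0$ here, since otherwise vectors $\mathbf{x}^{(1)}$,
$\mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(k)}$ would be linearly dependent. Hence the equation gives us

$$\mathbf{x}^{(k+s)} = -\frac{C_1}{C_{k+s}}\,\mathbf{x}^{(1)} - \frac{C_2}{C_{k+s}}\,\mathbf{x}^{(2)} - \ldots - \frac{C_k}{C_{k+s}}\,\mathbf{x}^{(k)},$$

i.e. $\mathbf{x}^{(k+s)}$ is expressed linearly in terms of $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(k)}$. Let $\mathbf{x}^{(1)}$,
$\mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(n)}$ be any $n$ linearly independent vectors. We can take as
an example of these:

$$(1, 0, 0, \ldots, 0);\quad (0, 1, 0, \ldots, 0);\quad \ldots;\quad (0, 0, 0, \ldots, 1). \tag{21}$$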


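If we take any vector we please, $\mathbf{x}$, the $(n + 1)$ vectors $\mathbf{x}^{(1)}$, $\mathbf{x}^{(2)}$,
$\ldots$, $\mathbf{x}^{(n)}$, $\mathbf{x}$ are in fact linearly dependent, as we have seen:

$$C_1\mathbf{x}^{(1)} + C_2\mathbf{x}^{(2)} + \ldots + C_n\mathbf{x}^{(n)} + C\mathbf{x} = 0,$$

the constant $C$ being unquestionably non-zero, since otherwise vectors
$\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(n)}$ would be linearly dependent. It follows from the above
that any vector $\mathbf{x}$ is expressible linearly in terms of $n$ linearly independent
vectors:

$$\mathbf{x} = \alpha_1\mathbf{x}^{(1)} + \alpha_2\mathbf{x}^{(2)} + \ldots + \alpha_n\mathbf{x}^{(n)} \qquad \left(\alpha_s = -\frac{C_s}{C}\right). \tag{22}$$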
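
Finding the coefficients $\alpha_s$ of (22) for given vectors amounts to solving $n$ linear equations whose determinant is built from the components of $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(n)}$. The Python sketch below does this by Cramer's rule with exact fractions; the name `coordinates` and the sample vectors are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row (suitable for small n only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def coordinates(basis, x):
    """Coefficients alpha_s with x = alpha_1 x^(1) + ... + alpha_n x^(n)."""
    n = len(basis)
    # Column s of the matrix holds the components of the s-th basis vector.
    m = [[Fraction(basis[s][i]) for s in range(n)] for i in range(n)]
    d = det(m)
    if d == 0:
        raise ValueError("the given vectors are linearly dependent")
    alphas = []
    for s in range(n):
        replaced = [row[:s] + [Fraction(x[i])] + row[s + 1:] for i, row in enumerate(m)]
        alphas.append(det(replaced) / d)
    return alphas


basis = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]
alphas = coordinates(basis, (2, 3, 5))
```

With the unit vectors (21) as the linearly independent set, the same function simply returns the components of the vector itself.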
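It may easily be seen that the expression for $\mathbf{x}$ in terms of $\mathbf{x}^{(1)}$, $\mathbf{x}^{(2)}$,
$\ldots$, $\mathbf{x}^{(n)}$ is unique. If, in addition to the above expression, there existed
the further expression

$$\mathbf{x} = \beta_1\mathbf{x}^{(1)} + \beta_2\mathbf{x}^{(2)} + \ldots + \beta_n\mathbf{x}^{(n)},$$

where the $\beta_s$ differ from the corresponding $\alpha_s$, subtraction of the two
expressions would give us

$$(\alpha_1 - \beta_1)\mathbf{x}^{(1)} + (\alpha_2 - \beta_2)\mathbf{x}^{(2)} + \ldots + (\alpha_n - \beta_n)\mathbf{x}^{(n)} = 0,$$

i.e. vectors $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(n)}$ are linearly dependent, which is false. If we
take vectors (21) for $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(n)}$, the $\alpha_s$ in (22) are evidently the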



components $x_s$ of the vector $\mathbf{x}(x_1, x_2, \ldots, x_n)$. We can also speak of
the $\alpha_s$ as the components of $\mathbf{x}$ in the general case, when $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(n)}$
are taken as the fundamental vectors. On assigning all possible complex
values to the numbers $\alpha_s$, we obtain all the vectors of our n-dimensional
space. We now suppose that we have $k$ linearly independent vectors

$$\mathbf{x}^{(1)},\ \mathbf{x}^{(2)},\ \ldots,\ \mathbf{x}^{(k)} \tag{23}$$

where $k < n$. The set of vectors obtained in accordance with

$$\mathbf{y} = C_1\mathbf{x}^{(1)} + C_2\mathbf{x}^{(2)} + \ldots + C_k\mathbf{x}^{(k)}, \tag{23$_1$}$$

where the $C_s$ are arbitrary constants, is said to form a k-dimensional
subspace $L_k$. It can be shown as above that any vector belonging to $L_k$
is uniquely expressible in terms of $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(k)}$. In other words,
vectors (23) form a subspace $L_k$.

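We notice that, if any vector $\mathbf{z}$ belongs to $L_k$, i.e. is expressible
by an equation of type (23$_1$), the vector $c\mathbf{z}$, where $c$ is any constant,
is evidently also given by an equation of type (23$_1$), i.e. also belongs
to $L_k$. Similarly, if $\mathbf{z}^{(1)}$ and $\mathbf{z}^{(2)}$ belong to $L_k$, their sum $\mathbf{z}^{(1)} + \mathbf{z}^{(2)}$
also belongs to $L_k$. Hence a more general property follows at once:
if vectors $\mathbf{z}^{(1)}, \mathbf{z}^{(2)}, \ldots, \mathbf{z}^{(p)}$ belong to $L_k$, any linear combination of
them, $\gamma_1\mathbf{z}^{(1)} + \gamma_2\mathbf{z}^{(2)} + \ldots + \gamma_p\mathbf{z}^{(p)}$, also belongs to $L_k$.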

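We take any $m$ vectors belonging to $L_k$:

$$\mathbf{y}^{(s)} = C_1^{(s)}\mathbf{x}^{(1)} + C_2^{(s)}\mathbf{x}^{(2)} + \ldots + C_k^{(s)}\mathbf{x}^{(k)} \qquad (s = 1, 2, \ldots, m). \tag{24}$$

In view of the linear independence of vectors (23), a relationship of
the form

$$\alpha_1\mathbf{y}^{(1)} + \alpha_2\mathbf{y}^{(2)} + \ldots + \alpha_m\mathbf{y}^{(m)} = 0$$

is equivalent to a system of $k$ homogeneous equations in $\alpha_1, \alpha_2, \ldots, \alpha_m$:

$$\alpha_1 C_q^{(1)} + \alpha_2 C_q^{(2)} + \ldots + \alpha_m C_q^{(m)} = 0 \qquad (q = 1, 2, \ldots, k).$$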


If this system has a non-trivial solution, vectors (24) are linearly
dependent. In particular, if $m > k$, non-trivial solutions certainly
exist, i.e. any set of more than $k$ vectors of the subspace formed by
vectors (23) is a linearly dependent set. It follows at once from this
that the subspace formed by linearly independent vectors (23) cannot
be formed by using a set of linearly independent vectors $\mathbf{z}^{(1)}, \ldots, \mathbf{z}^{(l)}$,
the number of which is $l < k$. For otherwise, by what we have proved
above, there could not exist more than $l$ linearly independent vectors
in the subspace, whilst on the other hand, the linearly independent
vectors (23), the number of which, $k$, is greater than $l$, have to belong



to the subspace. If we take any $k$ linearly independent vectors $\mathbf{u}^{(1)}$,
$\mathbf{u}^{(2)}, \ldots, \mathbf{u}^{(k)}$, belonging to $L_k$, they form, in the sense indicated above,
the same subspace $L_k$.

This follows because, by definition of subspace, any linear
combination

$$C_1\mathbf{u}^{(1)} + C_2\mathbf{u}^{(2)} + \ldots + C_k\mathbf{u}^{(k)}$$

belongs to $L_k$. Whereas if we take any vector $\mathbf{y}$ of $L_k$, the $(k + 1)$
vectors $\mathbf{u}^{(1)}, \ldots, \mathbf{u}^{(k)}, \mathbf{y}$ belong to $L_k$ and are therefore linearly
dependent, by the above:

$$\beta_1\mathbf{u}^{(1)} + \beta_2\mathbf{u}^{(2)} + \ldots + \beta_k\mathbf{u}^{(k)} + \gamma\mathbf{y} = 0,$$

where $\gamma$ must be non-zero since $\mathbf{u}^{(1)}, \ldots, \mathbf{u}^{(k)}$ are linearly independent.
This gives us the result that any vector $\mathbf{y}$ of $L_k$ can be expressed
in terms of $\mathbf{u}^{(1)}, \ldots, \mathbf{u}^{(k)}$, i.e. these latter vectors in fact produce $L_k$.
If $m = k$ in expressions (24), and the determinant of the coefficients
$C_q^{(s)}$ differs from zero, $\mathbf{y}^{(1)}, \ldots, \mathbf{y}^{(k)}$ must be linearly independent
vectors of $L_k$. It is easily shown that in general the number of linearly
independent vectors yielded by expressions (24) is equal to the rank
of the matrix of $C_q^{(s)}$.

We saw above that if a vector $\mathbf{z}$ belongs to a certain subspace $L$,
the vector $c\mathbf{z}$, where $c$ is any constant, also belongs to $L$; and if $\mathbf{z}^{(1)}$
and $\mathbf{z}^{(2)}$ belong to $L$, their sum also belongs to $L$. We might have
given another definition of subspace, viz, a subspace is a set of vectors
such that, if $\mathbf{z}$ belongs to $L$, $c\mathbf{z}$ belongs to $L$, whilst if $\mathbf{z}^{(1)}$ and $\mathbf{z}^{(2)}$ belong
to $L$, $(\mathbf{z}^{(1)} + \mathbf{z}^{(2)})$ also belongs to $L$. An immediate consequence of this
is that any linear combination of vectors belonging to $L$ also belongs
to $L$. We have just seen that the properties forming part of the new
definition follow as corollaries from the first definition. We can show
conversely that the first is a consequence of the new definition, i.e.
the two definitions are equivalent.

Let $\mathbf{x}^{(1)}$ be a certain vector belonging to $L$. By definition of $L$,
vectors $C_1\mathbf{x}^{(1)}$, with arbitrary $C_1$, also belong to $L$. If $L$ is altogether
exhausted by these vectors, we must have $L_1$ in the previous sense.
If this is not the case, and a vector $\mathbf{x}^{(2)}$, linearly independent of $\mathbf{x}^{(1)}$,
appears in $L$, the vectors $C_1\mathbf{x}^{(1)} + C_2\mathbf{x}^{(2)}$, with arbitrary $C_1$ and $C_2$,
belong to $L$. If $L$ is altogether exhausted by these vectors, $L$ is an $L_2$
in the previous sense. If the opposite is the case, an $\mathbf{x}^{(3)}$ appears in $L$,
such that $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \mathbf{x}^{(3)}$ are linearly independent. By proceeding in this
way, we can exhaust $L$ completely by means of a finite set of linearly
independent vectors, the number of these being not greater than $n$.




The greatest number k of these linearly independent x gives us the 
dimensions of the subspace L. If it happens that k = n, L coincides 
with the total n-dimensional space. 

We note a point in connection with the formation of subspaces. 
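Let the vectors $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(k)}$ be linearly dependent. We can now
say, as before, that formula (23$_1$) defines a subspace $L$. Let the first
$l$ vectors: $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(l)}$, be linearly independent, whilst each of
the remaining vectors: $\mathbf{x}^{(l+1)}, \ldots, \mathbf{x}^{(k)}$ can be expressed linearly in
terms of the first $l$. The set of vectors defined by (23$_1$) is now clearly
the same as the set defined by

$$\mathbf{y} = C_1\mathbf{x}^{(1)} + C_2\mathbf{x}^{(2)} + \ldots + C_l\mathbf{x}^{(l)},$$

i.e. the subspace $L$ defined by the linearly dependent $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(k)}$,
is $l$-dimensional ($l < k$).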

Let us take real three-dimensional space and agree to measure
vectors from a fixed point $O$ (the origin). Here, $n = 3$. With $k = 1$,
the subspace $L_1$ is a straight line passing through $O$, whilst $L_2$ is a
plane passing through $O$.
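
The number of dimensions of the subspace formed by a given set of vectors, whether linearly dependent or not, is the rank of the matrix of their components. A short Python sketch follows, assuming exact fractions and elimination as the way of finding the rank; the names `rank` and `dimension` and the sample vectors are our own.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rectangular matrix, found by elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def dimension(vectors):
    """Dimension of the subspace of all linear combinations of the given vectors."""
    return rank([list(v) for v in vectors])


# Three vectors of real three-dimensional space; the third is the sum of the first two.
plane = dimension([(1, 0, 1), (0, 1, 1), (1, 1, 2)])
line = dimension([(2, 4, 6)])
```

In the first case the three linearly dependent vectors form a plane through the origin, and in the second a single non-null vector forms a straight line through the origin.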


13. Scalar product. We use the following notation: if $a$ is a complex
number, $\bar{a}$ is the complex conjugate of $a$, and $|a|$ is the modulus
of $a$. We thus have $a\bar{a} = |a|^2$. If $a$ is real, $\bar{a} = a$ and $|a|^2 = a^2$.
We now introduce a new concept, of great importance for what follows.

DEFINITION. The scalar product of two vectors

$$\mathbf{x}(x_1, x_2, \ldots, x_n) \quad\text{and}\quad \mathbf{y}(y_1, y_2, \ldots, y_n)$$

is defined as the number represented by the sum

$$\sum_{s=1}^{n} x_s \bar{y}_s.$$

We shall denote the scalar product by the symbol $(\mathbf{x}, \mathbf{y})$. We have:

$$(\mathbf{x}, \mathbf{y}) = \sum_{s=1}^{n} x_s \bar{y}_s, \qquad (\mathbf{y}, \mathbf{x}) = \sum_{s=1}^{n} y_s \bar{x}_s,$$

whence it follows that

$$(\mathbf{y}, \mathbf{x}) = \overline{(\mathbf{x}, \mathbf{y})}.$$


We say that two vectors are perpendicular or orthogonal to each other
if their scalar product is zero. Inasmuch as the conjugate of zero is
zero, the order of the vectors in the scalar product has no importance as
regards the condition for orthogonality. Obviously, the null vector
(0, 0, ..., 0) is orthogonal to any vector x.

The properties

$$(ax, y) = a(x, y); \qquad (x, ay) = \bar{a}(x, y),$$

where $a$ is a numerical factor, follow at once from the definition of
scalar product. Furthermore:

$$(x + y, z) = (x, z) + (y, z); \qquad (x, y + z) = (x, y) + (x, z),$$

the distributive law being valid for any number of terms. We have,
for instance:

$$(x + y, u + v) = (x, u) + (x, v) + (y, u) + (y, v).$$
We form the scalar product of $x(x_1, x_2, \ldots, x_n)$ with itself:

$$(x, x) = \sum_{s=1}^{n} x_s \bar{x}_s = \sum_{s=1}^{n} |x_s|^2.$$

We thus get a real number, positive for non-zero vectors x, and zero
for the null vector (0, 0, ..., 0). The square root (numerical value)
of the real number (x, x) is called the norm or length of vector x. On using
$\|x\|$ to denote the norm, we can write:

$$\|x\|^2 = (x, x) = \sum_{s=1}^{n} |x_s|^2 \quad \text{or} \quad \|x\| = \sqrt{(x, x)} = \sqrt{\sum_{s=1}^{n} |x_s|^2}.$$
The equation ||x|| = 0 is equivalent to the fact that x is the null 
vector. Suppose we have three mutually perpendicular vectors x, y 
and z, i.e. 

(x,y) =0; (x,z)=0; (y,z)=0. 
On using the distributive law for scalar products and taking into 
account the equations written, we get: 


$$(x + y + z, x + y + z) = (x, x) + (y, y) + (z, z)$$

or

$$\|x + y + z\|^2 = \|x\|^2 + \|y\|^2 + \|z\|^2.$$

This expresses the theorem of Pythagoras. It is valid for any number
of terms, with the essential proviso that the terms are orthogonal in
pairs. We show that if the vectors $x^{(1)}, x^{(2)}, \ldots, x^{(l)}$, none of which is
the null vector, are orthogonal in pairs, they are linearly independent.
We take

$$\sum_{s=1}^{l} C_s x^{(s)} = 0$$


and show that all the numbers $C_s$ must be zero. We form the scalar
product of both sides of this equation with $x^{(k)}$, where $k$ is one of the
numbers 1, 2, ..., $l$:

$$\sum_{s=1}^{l} C_s (x^{(s)}, x^{(k)}) = 0.$$

Since pairs of the $x^{(s)}$ are orthogonal, we have $(x^{(s)}, x^{(k)}) = 0$
for $s \neq k$; hence the above equation gives $C_k (x^{(k)}, x^{(k)}) = 0$, i.e.
$C_k \|x^{(k)}\|^2 = 0$, whence, since $\|x^{(k)}\|^2 > 0$, it follows that $C_k = 0$,
this being true for any choice of $k$.
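
The definitions of this article can be tried out numerically. The following is only a minimal sketch in Python (standard library only); the function names `scalar_product` and `norm` and the three sample vectors are illustrative assumptions, not part of the text. It uses the convention of the definition above, in which the second vector is conjugated, and checks Pythagoras' theorem for three vectors that are orthogonal in pairs.

```python
import math

def scalar_product(x, y):
    """(x, y) = sum over s of x_s * conj(y_s), as in the definition above."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    """||x|| = sqrt((x, x)); the scalar product (x, x) is real and non-negative."""
    return math.sqrt(scalar_product(x, x).real)

# Illustrative vectors of complex three-dimensional space, orthogonal in pairs.
x = [1, 1j, 0]
y = [1, -1j, 0]
z = [0, 0, 2 + 1j]
total = [a + b + c for a, b, c in zip(x, y, z)]

# The three scalar products (x, y), (x, z), (y, z) should all vanish.
pairwise = [scalar_product(x, y), scalar_product(x, z), scalar_product(y, z)]

# Pythagoras: ||x + y + z||^2 - (||x||^2 + ||y||^2 + ||z||^2) should vanish.
pythagoras_gap = norm(total) ** 2 - (norm(x) ** 2 + norm(y) ** 2 + norm(z) ** 2)
```

Up to rounding error, every entry of `pairwise` and the value `pythagoras_gap` should be zero.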


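14. Geometrical interpretation of homogeneous systems. We take the
homogeneous system

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \ldots + a_{1n} x_n &= 0 \\
a_{21} x_1 + a_{22} x_2 + \ldots + a_{2n} x_n &= 0 \\
\ldots \qquad \ldots \qquad \ldots \qquad & \\
a_{n1} x_1 + a_{n2} x_2 + \ldots + a_{nn} x_n &= 0.
\end{aligned} \tag{25}$$

We bring in the vectors

$$a^{(1)}(\bar{a}_{11}, \bar{a}_{12}, \ldots, \bar{a}_{1n}), \; \ldots, \; a^{(n)}(\bar{a}_{n1}, \bar{a}_{n2}, \ldots, \bar{a}_{nn}). \tag{26}$$

System (25) can now be written in the compressed form:

$$(x, a^{(1)}) = 0; \; \ldots; \; (x, a^{(n)}) = 0, \tag{27}$$

so that the problem amounts to finding a vector x, perpendicular
to all the vectors $a^{(i)}$. If the determinant $|a_{ik}|$ differs from zero,
the determinant $|\bar{a}_{ik}|$, with conjugate magnitude, also differs from
zero. In this case, the vectors $a^{(i)}$ are linearly independent, and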
system (27) only has a trivial solution, i.e. there exists no vector 
(apart from the null vector) which is simultaneously perpendicular 
to n linearly independent vectors (in n-dimensional space). 

We now take the case when the determinant of system (25) vanishes.
Let the rank of the system be $k$. If a matrix is formed of the conjugate
elements, the determinants appearing in it will be conjugate in
magnitude to the determinants appearing in the array of $a_{ik}$, and
the rank of the conjugate matrix will also evidently be $k$. Hence, by
what we have shown above, there will be $k$ linearly independent
vectors among the $a^{(i)}$, the remainder being linear combinations of
these. We can suppose without loss of generality that these linearly
independent vectors are

$$a^{(1)}, \ldots, a^{(k)}, \tag{28}$$


whilst for the remainder we have expressions of the form

$$a^{(k+s)} = \beta_1^{(k+s)} a^{(1)} + \ldots + \beta_k^{(k+s)} a^{(k)} \qquad (k + s = k + 1, k + 2, \ldots, n),$$

where the $\beta_i^{(k+s)}$ are numerical coefficients. It now follows at once that
if x is orthogonal to vectors (28), it is perpendicular to all the
vectors $a^{(i)}$. In fact:

$$(x, a^{(k+s)}) = \sum_{i=1}^{k} (x, \beta_i^{(k+s)} a^{(i)}),$$

and the sum as a whole is zero since each individual term vanishes
by hypothesis. It is thus sufficient to solve the first $k$ equations of
the system. Assuming, as usual, that a non-zero determinant of order
$k$ is at the top left corner, we get $(n - k)$ linearly independent solutions
$x^{(1)}, \ldots, x^{(n-k)}$ for the required vector x by the method indicated
in [12], and every solution will consist of a linear combination of
these $(n - k)$ vectors. We can say in our present case that the vectors
given by

$$y = C_1 a^{(1)} + \ldots + C_k a^{(k)},$$

where the $C_i$ are arbitrary constants, form a $k$-dimensional space $L_k$,
which is in fact a subspace of the total n-dimensional space. In the
same way, the vectors obtained, $x^{(1)}, \ldots, x^{(n-k)}$, form an $(n - k)$-dimensional
subspace $M_{n-k}$. The subspace $M_{n-k}$ is orthogonal to
the subspace $L_k$ in the sense that any vector of $M_{n-k}$ is orthogonal
to any vector of $L_k$ (and conversely, of course). The subspace $M_{n-k}$
consists of the vectors which satisfy system (27), i.e. are orthogonal
to $a^{(1)}, a^{(2)}, \ldots, a^{(n)}$. The $n$ vectors $a^{(1)}, \ldots, a^{(k)}, x^{(1)}, \ldots, x^{(n-k)}$ are
readily seen to be linearly independent. For suppose, on the contrary,
that a relationship exists between them:

$$(c_1 a^{(1)} + \ldots + c_k a^{(k)}) + (d_1 x^{(1)} + \ldots + d_{n-k} x^{(n-k)}) = 0. \tag{29}$$


The first bracket yields a vector a of $L_k$, and the second a vector x
of $M_{n-k}$, and we now have a + x = 0 or a = −x. But a and x are
orthogonal to each other, i.e. a must be orthogonal to itself, in
other words, (a, a) = 0 or $\|a\| = 0$, which means that a is the null
vector. The same can be said of x. Hence:

$$c_1 a^{(1)} + \ldots + c_k a^{(k)} = 0 \quad \text{and} \quad d_1 x^{(1)} + \ldots + d_{n-k} x^{(n-k)} = 0.$$

But $a^{(1)}, \ldots, a^{(k)}$ are linearly independent by hypothesis, and all the
constants $c_s$ must consequently vanish; and the same can be said
as regards the $d_s$. All the coefficients in (29) therefore vanish, i.e.
vectors $a^{(1)}, \ldots, a^{(k)}, x^{(1)}, \ldots, x^{(n-k)}$ are in fact linearly independent.

Every vector x can be uniquely represented in the form

$$x = (\gamma_1 a^{(1)} + \ldots + \gamma_k a^{(k)}) + (\delta_1 x^{(1)} + \ldots + \delta_{n-k} x^{(n-k)}),$$


the first bracket giving the vector belonging to $L_k$, and the second
the vector belonging to $M_{n-k}$. As we have already mentioned, the
vectors that compose $M_{n-k}$ are all the possible solutions of system
(27), and hence, whatever the choice of the total system of linearly
independent solutions, the number of these solutions is equal to
$(n - k)$, i.e. equal to the number of dimensions of $M_{n-k}$. The earlier
discussion of homogeneous systems leads to the following important
result.

If $L_k$ is a $k$-dimensional subspace ($k \le n$), the vectors orthogonal to it
form an $(n - k)$-dimensional subspace $M_{n-k}$, and every vector x of $R_n$
can be written as the sum x = y + z, where y belongs to $L_k$ and z to $M_{n-k}$.

We show that the representation of x as a sum is unique. Suppose
that, in addition to the above, we have x = u + v, where u belongs
to $L_k$ and v to $M_{n-k}$. We want to show that u = y and v = z. We have:
y + z = u + v, whence y − u = v − z. The difference y − u belongs
to $L_k$, whilst v − z belongs to $M_{n-k}$, whence it follows that y − u is
orthogonal to itself, i.e. (y − u, y − u) = 0 or $\|y - u\| = 0$, so that
y − u = 0 and y = u. Since y − u = v − z, it now follows that
v = z. In the representation of x as x = y + z, y is known as the projection
of x on the subspace $L_k$. The vectors y and z are orthogonal, and
Pythagoras’ theorem gives: $\|x\|^2 = \|y\|^2 + \|z\|^2$, whence we have
$\|y\| \le \|x\|$, the sign of equality being obtained when and only when
z is the null vector, i.e. when x belongs to $L_k$, so that y = x. Similarly,
$\|z\| \le \|x\|$, and the sign of equality is obtained when and only
when x is orthogonal to $L_k$, i.e. z = x. We usually describe $L_k$ and
$M_{n-k}$ as complementary orthogonal subspaces. If $k = n$, $L_k$ is the whole
of $R_n$, whilst $M_0$ reduces to the null vector.

Let us take the real three-dimensional space that we discussed above,
and let k = 2, so that n − k = 3 − 2 = 1. The subspace $L_2$ is a
plane P passing through the point O, whilst $M_1$ is a straight line
passing through O and perpendicular to P. Any vector can be uniquely
represented as the sum of two vectors, one of which lies in the plane P,
whilst the other is along this line. We have interpreted geometrically
the solution of a homogeneous system in the case when the number of
equations is equal to the number of unknowns. The general case can
be treated in precisely the same way, when the number of vectors $a^{(i)}$
is not necessarily equal to n. Similar remarks apply as regards the
next article.
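
The decomposition x = y + z into complementary orthogonal subspaces can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the method of the text: instead of solving system (27), it builds an orthonormal basis of the subspace by the Gram-Schmidt process (a technique not used in this article) and subtracts the projection. The names `orthonormal_basis` and `decompose` and the sample plane are hypothetical.

```python
def scalar_product(x, y):
    """(x, y) = sum over s of x_s * conj(y_s)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: an orthonormal basis of the subspace spanned by `vectors`."""
    basis = []
    for v in vectors:
        w = [complex(c) for c in v]
        for e in basis:
            coeff = scalar_product(w, e)            # component of w along e
            w = [wc - coeff * ec for wc, ec in zip(w, e)]
        length = abs(scalar_product(w, w)) ** 0.5
        if length > tol:                            # linearly dependent vectors are skipped
            basis.append([wc / length for wc in w])
    return basis

def decompose(x, spanning_vectors):
    """Return (y, z) with x = y + z, y in the subspace L and z orthogonal to L."""
    y = [0j] * len(x)
    for e in orthonormal_basis(spanning_vectors):
        coeff = scalar_product(x, e)
        y = [yc + coeff * ec for yc, ec in zip(y, e)]
    z = [complex(xc) - yc for xc, yc in zip(x, y)]
    return y, z

# Real three-dimensional space: L_2 is the plane spanned by two sample vectors,
# M_1 is the line through O perpendicular to that plane.
y, z = decompose([3, 4, 5], [[1, 0, 0], [1, 1, 0]])
```

Here `y` is the projection of the sample vector on the plane and `z` lies along the perpendicular line, so that the squares of their norms add up to the square of the norm of the original vector.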


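15. Non-homogeneous systems. We take the non-homogeneous
system:

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \ldots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \ldots + a_{2n} x_n &= b_2 \\
\ldots \qquad \ldots \qquad \ldots \qquad & \\
a_{n1} x_1 + a_{n2} x_2 + \ldots + a_{nn} x_n &= b_n.
\end{aligned} \tag{30}$$

This can be interpreted as the problem of finding the vector
$x(x_1, x_2, \ldots, x_n)$ from the system:

$$(x, a^{(1)}) = b_1; \; \ldots; \; (x, a^{(n)}) = b_n, \tag{31}$$

given the vectors (26).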

If the determinant of the system differs from zero, Cramer’s theorem
provides a unique solution. Suppose the determinant vanishes and
the rank of the matrix of its coefficients is $k$, a non-zero determinant
of order $k$ being situated at the top left corner as usual. Along with
system (30), we write down the system of homogeneous equations
whose coefficients are obtained from the coefficients of the given
system by replacing rows with columns and all numbers with their
conjugates. The system will take the form:

$$\begin{aligned}
\bar{a}_{11} y_1 + \bar{a}_{21} y_2 + \ldots + \bar{a}_{n1} y_n &= 0 \\
\bar{a}_{12} y_1 + \bar{a}_{22} y_2 + \ldots + \bar{a}_{n2} y_n &= 0 \\
\ldots \qquad \ldots \qquad \ldots \qquad & \\
\bar{a}_{1n} y_1 + \bar{a}_{2n} y_2 + \ldots + \bar{a}_{nn} y_n &= 0.
\end{aligned}$$

As before, the matrix of its coefficients has rank $k$ and a non-zero $k$th
order determinant stands at the top left corner. The homogeneous
system is known as the adjoint of system (30). We have seen above
that its general solution is a linear combination of $(n - k)$ solutions
(vectors) which can be obtained, for instance, by using Cramer’s theorem
to solve the first $k$ equations with respect to $y_1, \ldots, y_k$, the remaining
$y_{k+s}$ being put equal to zero except for one which is put equal to unity.
This method brings us, with $y_{k+1} = 1$, to the system:

$$\begin{aligned}
\bar{a}_{11} y_1 + \bar{a}_{21} y_2 + \ldots + \bar{a}_{k1} y_k &= -\bar{a}_{k+1,1} \\
\bar{a}_{12} y_1 + \bar{a}_{22} y_2 + \ldots + \bar{a}_{k2} y_k &= -\bar{a}_{k+1,2} \\
\ldots \qquad \ldots \qquad \ldots \qquad & \\
\bar{a}_{1k} y_1 + \bar{a}_{2k} y_2 + \ldots + \bar{a}_{kk} y_k &= -\bar{a}_{k+1,k}.
\end{aligned} \tag{32}$$

On solving this system and taking conjugate values, we get:

$$\bar{y}_m = -\frac{\Delta_m}{\Delta'} \quad (m = 1, 2, \ldots, k); \qquad \bar{y}_{k+1} = 1; \qquad \bar{y}_{k+2} = \bar{y}_{k+3} = \ldots = \bar{y}_n = 0, \tag{33}$$

where

$$\Delta' = \begin{vmatrix}
a_{11} & a_{21} & \ldots & a_{k1} \\
a_{12} & a_{22} & \ldots & a_{k2} \\
\ldots & \ldots & \ldots & \ldots \\
a_{1k} & a_{2k} & \ldots & a_{kk}
\end{vmatrix} \neq 0,$$

and $\Delta_m$ is found from $\Delta'$ by substituting $a_{k+1,1}, \ldots, a_{k+1,k}$ for the elements
of the mth column. We write the condition for vector b with components
$(b_1, \ldots, b_n)$ to be perpendicular to the y which we have obtained
just now by solving system (32):

$$(b, y) = -\sum_{m=1}^{k} \frac{\Delta_m}{\Delta'} b_m + b_{k+1} = 0$$

or

$$-\sum_{m=1}^{k} \Delta_m b_m + \Delta' b_{k+1} = 0. \tag{34}$$

On interchanging rows and columns in the determinant $\Delta_m$, then
moving the mth row to the final position with the aid of $(k - m)$
interchanges of subsequent rows, we get:

$$-\Delta_m = (-1)^{k-m+1} \begin{vmatrix}
a_{11} & a_{12} & \ldots & a_{1k} \\
\ldots & \ldots & \ldots & \ldots \\
a_{m-1,1} & a_{m-1,2} & \ldots & a_{m-1,k} \\
a_{m+1,1} & a_{m+1,2} & \ldots & a_{m+1,k} \\
\ldots & \ldots & \ldots & \ldots \\
a_{k1} & a_{k2} & \ldots & a_{kk} \\
a_{k+1,1} & a_{k+1,2} & \ldots & a_{k+1,k}
\end{vmatrix}.$$

This is precisely the cofactor of the element $b_m$ in the characteristic
determinant:

$$\Delta_{k+1} = \begin{vmatrix}
a_{11} & a_{12} & \ldots & a_{1k} & b_1 \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
a_{k1} & a_{k2} & \ldots & a_{kk} & b_k \\
a_{k+1,1} & a_{k+1,2} & \ldots & a_{k+1,k} & b_{k+1}
\end{vmatrix},$$

so that condition (34) in fact expresses the vanishing of the characteristic
determinant. Similarly, with $y_{k+s} = 1$, we get the condition
$\Delta_{k+s} = 0$. We thus arrive at the following result: if the determinant of
system (30) vanishes, the necessary and sufficient condition for the system
to have a solution is that the vector $(b_1, \ldots, b_n)$ should be orthogonal to all
the vectors yielding solutions of the homogeneous adjoint system (32).

The general solution of system (30) is the sum of any particular
solution of the system and the general solution of the corresponding
homogeneous system obtained by replacing all the $b_i$ in (30) by zeros.
The general solution of the homogeneous system will contain $(n - k)$
arbitrary constants.
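
The solvability condition just stated can be checked on the smallest non-trivial case. The following sketch is restricted, by assumption, to a singular system of two equations whose top left coefficient is non-zero (n = 2, k = 1); the function names and the sample matrix are illustrative only. It forms the solution of the adjoint system with $y_2 = 1$, as described above, and tests whether b is orthogonal to it.

```python
def scalar_product(x, y):
    """(x, y) = sum over s of x_s * conj(y_s)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))

def adjoint_solution_2x2(a):
    """Non-trivial solution of the adjoint system for a singular 2x2 matrix
    with a[0][0] != 0: put y_2 = 1 and solve conj(a11) * y_1 = -conj(a21)."""
    a11, a21 = complex(a[0][0]), complex(a[1][0])
    return [-a21.conjugate() / a11.conjugate(), 1 + 0j]

def is_solvable(a, b, tol=1e-12):
    """True when b is orthogonal to the solution of the adjoint system."""
    y = adjoint_solution_2x2(a)
    return abs(scalar_product(b, y)) < tol

# Sample singular matrix: its determinant 1*(-1) - (1j)*(1j) vanishes.
A = [[1, 1j], [1j, -1]]
y_adjoint = adjoint_solution_2x2(A)
consistent = is_solvable(A, [2, 2j])     # second equation is 1j times the first
inconsistent = is_solvable(A, [2, 1])    # right-hand sides contradict each other
```

For this sample matrix the second equation is the first multiplied by i, so the system can only be consistent when $b_2 = i b_1$, which is exactly what the orthogonality test detects.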

A further geometrical interpretation of the basic theorem regarding
the solution of systems may be pointed out, since it is of importance
later. Let us take n linear forms with n independent variables:

$$y_i = a_{i1} x_1 + a_{i2} x_2 + \ldots + a_{in} x_n \qquad (i = 1, 2, \ldots, n).$$

We shall suppose that the $x_s$ can take any complex values, $(y_1, \ldots, y_n)$
being regarded as the components of a certain vector. If the determinant
$|a_{ik}|$ is not zero, we get definite values of $x_s$ for any given $y_s$,
and the previous formulae yield the whole of the n-dimensional space.
Now let the matrix $\|a_{ik}\|$ have rank $r < n$. We can assume without
loss of generality that the rth order determinant at the top left corner
is not zero. With this, the basic theorem regarding the solution of
systems tells us the following: the set of values $(y_1, \ldots, y_n)$, obtained
in accordance with the previous formulae, possesses the property
that the values $y_1, \ldots, y_r$ may be chosen arbitrarily, yet once these
are fixed, the remaining $y_{r+1}, \ldots, y_n$ are fully defined, being obtained
from the vanishing condition for the characteristic determinants. This
means in geometrical language that the previous formulae yield an
r-dimensional subspace, formed by the vectors that are obtained by
putting one of the $y_s$ $(s = 1, 2, \ldots, r)$ equal to unity and the remainder
equal to zero. All in all, then, if the rank of matrix $\|a_{ik}\|$ is $r$, the
previous formulae yield a set of values $(y_1, \ldots, y_n)$ defining an r-dimensional
subspace.

We have taken the case when the number of linear forms is equal
to the number of variables $x_s$. We have in the general case:

$$\begin{aligned}
y_1 &= a_{11} x_1 + \ldots + a_{1n} x_n \\
\ldots \; & \qquad \ldots \qquad \ldots \\
y_m &= a_{m1} x_1 + \ldots + a_{mn} x_n.
\end{aligned}$$

With arbitrary $x_s$, these formulae now define a subspace in the
m-dimensional space, the number of dimensions of the subspace being
equal to the rank of $\|a_{ik}\|$. The proof is the same as above.


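16. Gram’s determinant. Hadamard’s inequality. Let us take m vectors:

$$x^{(s)}(x_1^{(s)}, x_2^{(s)}, \ldots, x_n^{(s)}) \qquad (s = 1, 2, \ldots, m).$$

We form the mth order determinant of the scalar products $(x^{(i)}, x^{(k)})$ and
introduce the special notation:

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}) = |(x^{(i)}, x^{(k)})|_1^m = \begin{vmatrix}
(x^{(1)}, x^{(1)}) & (x^{(1)}, x^{(2)}) & \ldots & (x^{(1)}, x^{(m)}) \\
(x^{(2)}, x^{(1)}) & (x^{(2)}, x^{(2)}) & \ldots & (x^{(2)}, x^{(m)}) \\
\ldots & \ldots & \ldots & \ldots \\
(x^{(m)}, x^{(1)}) & (x^{(m)}, x^{(2)}) & \ldots & (x^{(m)}, x^{(m)})
\end{vmatrix}. \tag{35}$$

This is known as the Gram determinant of vectors $x^{(1)}, x^{(2)}, \ldots, x^{(m)}$.
We distinguish the cases m = n, m < n and m > n.
The general term of the Gram determinant has the form

$$(x^{(i)}, x^{(k)}) = \sum_{s=1}^{n} x_s^{(i)} \overline{x_s^{(k)}}.$$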

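With m = n, determinant (35) is equal to the product of the determinants:

$$\begin{vmatrix}
x_1^{(1)} & x_2^{(1)} & \ldots & x_n^{(1)} \\
x_1^{(2)} & x_2^{(2)} & \ldots & x_n^{(2)} \\
\ldots & \ldots & \ldots & \ldots \\
x_1^{(n)} & x_2^{(n)} & \ldots & x_n^{(n)}
\end{vmatrix} \cdot \begin{vmatrix}
\overline{x_1^{(1)}} & \overline{x_1^{(2)}} & \ldots & \overline{x_1^{(n)}} \\
\overline{x_2^{(1)}} & \overline{x_2^{(2)}} & \ldots & \overline{x_2^{(n)}} \\
\ldots & \ldots & \ldots & \ldots \\
\overline{x_n^{(1)}} & \overline{x_n^{(2)}} & \ldots & \overline{x_n^{(n)}}
\end{vmatrix},$$

the multiplication rule of rows by columns being used. On noticing that the
determinants are unchanged in value on interchanging rows with columns,
we can say that the second factor is the complex conjugate of the first, so that
with m = n the Gram determinant (35) is equal to the square of the modulus
of the determinant $|x_k^{(i)}|_1^n$, formed by the components $x_k^{(i)}$ of vectors
$x^{(1)}, x^{(2)}, \ldots, x^{(n)}$. Hence determinant (35) is positive if the vectors are linearly independent,
and zero if they are linearly dependent [12]. With m ≠ n, we have two rectangular
matrices:

$$\begin{Vmatrix}
x_1^{(1)} & x_2^{(1)} & \ldots & x_n^{(1)} \\
x_1^{(2)} & x_2^{(2)} & \ldots & x_n^{(2)} \\
\ldots & \ldots & \ldots & \ldots \\
x_1^{(m)} & x_2^{(m)} & \ldots & x_n^{(m)}
\end{Vmatrix} \; (36_1) \quad \text{and} \quad \begin{Vmatrix}
\overline{x_1^{(1)}} & \overline{x_1^{(2)}} & \ldots & \overline{x_1^{(m)}} \\
\overline{x_2^{(1)}} & \overline{x_2^{(2)}} & \ldots & \overline{x_2^{(m)}} \\
\ldots & \ldots & \ldots & \ldots \\
\overline{x_n^{(1)}} & \overline{x_n^{(2)}} & \ldots & \overline{x_n^{(m)}}
\end{Vmatrix}, \; (36_2)$$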
Ps ae aa cee 


and the matrix corresponding to determinant (35) is the product of these two last
matrices [7]. By the theorem proved in [7], determinant (35) vanishes for
m > n. In this case, however, vectors $x^{(1)}, x^{(2)}, \ldots, x^{(m)}$ are linearly dependent
[12]. With m < n, we have by the theorem proved above:

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}) = \sum_{r_1 < r_2 < \ldots < r_m} X\begin{pmatrix} 1, 2, \ldots, m \\ r_1, r_2, \ldots, r_m \end{pmatrix} Y\begin{pmatrix} r_1, r_2, \ldots, r_m \\ 1, 2, \ldots, m \end{pmatrix},$$

where $X\begin{pmatrix} 1, 2, \ldots, m \\ r_1, r_2, \ldots, r_m \end{pmatrix}$ denotes a minor of array $(36_1)$, and
$Y\begin{pmatrix} r_1, r_2, \ldots, r_m \\ 1, 2, \ldots, m \end{pmatrix}$ a minor of $(36_2)$. As above, the Y here is the conjugate of the X, and the last
equation can be written:

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}) = \sum_{r_1 < r_2 < \ldots < r_m} \left| X\begin{pmatrix} 1, 2, \ldots, m \\ r_1, r_2, \ldots, r_m \end{pmatrix} \right|^2. \tag{37}$$

If $x^{(1)}, x^{(2)}, \ldots, x^{(m)}$ are linearly independent, the rank of matrix $(36_1)$ is equal
to m [12], and at least one of the non-negative terms on the right-hand side of
equation (37) is positive. If the vectors are linearly dependent, on the other hand,
the rank of matrix $(36_1)$ is less than m, all the mth order determinants appearing
in the matrix vanish, and (37) implies that $G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}) = 0$. Thus all
three cases: m = n, m > n and m < n, lead us to the following general theorem:

THEOREM. The Gram determinant $G(x^{(1)}, x^{(2)}, \ldots, x^{(m)})$ is positive if the vectors
$x^{(1)}, x^{(2)}, \ldots, x^{(m)}$ are linearly independent, and zero if they are linearly dependent.

We now prove a further formula for the Gram determinant. As a preliminary,
we decide on the following notation. Let x be any vector of $R_n$ and let the expansion
be valid: x = y + z, where y belongs to the subspace defined by vectors
$x^{(1)}, x^{(2)}, \ldots, x^{(m)}$, and z is perpendicular to this subspace. We want to prove
that

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}, x) = \|z\|^2 \, G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}). \tag{38}$$
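On taking into account the equations

$$(x^{(i)}, x) = (x^{(i)}, y); \qquad (x, x^{(i)}) = (y, x^{(i)}),$$

which follow from the orthogonality of z to all $x^{(i)}$, and the equation (x, x) =
(y, y) + (z, z) [13], we can write:

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}, x) = \begin{vmatrix}
(x^{(1)}, x^{(1)}) & (x^{(1)}, x^{(2)}) & \ldots & (x^{(1)}, x^{(m)}) & (x^{(1)}, y) \\
(x^{(2)}, x^{(1)}) & (x^{(2)}, x^{(2)}) & \ldots & (x^{(2)}, x^{(m)}) & (x^{(2)}, y) \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
(x^{(m)}, x^{(1)}) & (x^{(m)}, x^{(2)}) & \ldots & (x^{(m)}, x^{(m)}) & (x^{(m)}, y) \\
(y, x^{(1)}) & (y, x^{(2)}) & \ldots & (y, x^{(m)}) & (y, y) + (z, z)
\end{vmatrix}.$$

On writing the elements of the last row as

$$(y, x^{(1)}) + 0, \; (y, x^{(2)}) + 0, \; \ldots, \; (y, x^{(m)}) + 0, \; (y, y) + (z, z),$$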


then expressing the determinant as the sum of two determinants in accordance
with property IV of [3], we get

$$G(x^{(1)}, \ldots, x^{(m)}, x) = G(x^{(1)}, \ldots, x^{(m)}, y) + \begin{vmatrix}
(x^{(1)}, x^{(1)}) & (x^{(1)}, x^{(2)}) & \ldots & (x^{(1)}, x^{(m)}) & (x^{(1)}, y) \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
(x^{(m)}, x^{(1)}) & (x^{(m)}, x^{(2)}) & \ldots & (x^{(m)}, x^{(m)}) & (x^{(m)}, y) \\
0 & 0 & \ldots & 0 & \|z\|^2
\end{vmatrix}. \tag{39}$$

The vector y belongs to the subspace defined by $x^{(1)}, \ldots, x^{(m)}$ and is therefore
linearly expressible in terms of the $x^{(k)}$; thus, by the theorem just proved:

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}, y) = 0.$$

On expanding the determinant of (39) by the last row, we in fact get (38).
The inequality:

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}, x) \le \|x\|^2 \, G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}) \tag{40}$$

is an obvious consequence of (38). It may be remarked that, if the $x^{(k)}$ are
linearly dependent, we have

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}, x) = G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}) = 0.$$

If the $x^{(k)}$ are linearly independent, the sign of equality is obtained in (40)
when and only when y = 0, i.e. when x is orthogonal to all the $x^{(k)}$.

Repeated application of inequality (40) to the original Gram determinant
$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)})$ gives us

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}) \le \|x^{(1)}\|^2 \, \|x^{(2)}\|^2 \ldots \|x^{(m)}\|^2. \tag{41}$$

It must be borne in mind here that $G(x^{(1)}) = \|x^{(1)}\|^2$.

We have the sign of equality in (41) when and only when the
vectors are orthogonal in pairs (on the assumption that none is the null vector). This
inequality leads easily to an inequality applicable to any determinant. Let
$\Delta$ be an nth order determinant with elements $a_{ik}$. We shall look on the elements
of the ith row as the components $(a_{i1}, a_{i2}, \ldots, a_{in})$ of a vector $x^{(i)}$ of $R_n$. We
form a new determinant with $\bar{a}_{ik}$, the conjugate elements to the $a_{ik}$, this determinant
being obviously equal to $\bar{\Delta}$. The product of $\Delta$ and $\bar{\Delta}$, multiplying rows
by rows, is the Gram determinant $G(x^{(1)}, \ldots, x^{(n)})$, its value being equal to
$\Delta\bar{\Delta}$, i.e. $|\Delta|^2$, by the theorem regarding the multiplication of determinants.
Application of inequality (41) now leads us to Hadamard’s inequality for the
modulus of a determinant:

$$|\Delta|^2 \le \sum_{k=1}^{n} |a_{1k}|^2 \cdot \sum_{k=1}^{n} |a_{2k}|^2 \ldots \sum_{k=1}^{n} |a_{nk}|^2. \tag{42}$$


If $\Delta$ has real elements, we can write:

$$\Delta^2 \le \sum_{k=1}^{n} a_{1k}^2 \cdot \sum_{k=1}^{n} a_{2k}^2 \ldots \sum_{k=1}^{n} a_{nk}^2. \tag{43}$$

If the elements of the determinant satisfy

$$|a_{ik}| \le M \qquad (i, k = 1, 2, \ldots, n),$$

it is obvious that

$$\sum_{k=1}^{n} |a_{ik}|^2 \le n M^2,$$

and we now have from (42):

$$|\Delta| \le n^{n/2} M^n. \tag{44}$$

It follows from our remarks above that the sign of equality is obtained in
(42) when, and only when, the vectors $x^{(i)}$ are orthogonal in pairs.
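
The theorem on the Gram determinant and Hadamard's inequality (42) lend themselves to a quick numerical check. The sketch below is illustrative only: the helper names `determinant`, `gram_determinant` and `hadamard_bound`, the use of Gaussian elimination, and the sample matrices are all assumptions made for the example.

```python
def scalar_product(x, y):
    """(x, y) = sum over s of x_s * conj(y_s)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [[complex(v) for v in row] for row in matrix]
    n = len(m)
    det = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            return 0j                      # no usable pivot: the determinant vanishes
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                     # a row interchange changes the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def gram_determinant(vectors):
    """G(x1, ..., xm): determinant of the scalar products (x_i, x_k), cf. (35)."""
    table = [[scalar_product(u, v) for v in vectors] for u in vectors]
    return determinant(table).real

def hadamard_bound(matrix):
    """Right-hand side of (42): product over the rows of sum_k |a_ik|^2."""
    bound = 1.0
    for row in matrix:
        bound *= sum(abs(complex(v)) ** 2 for v in row)
    return bound

rows = [[1, 2j], [3, 4]]                       # sample determinant, rows as vectors
modulus_squared = abs(determinant(rows)) ** 2   # |Delta|^2
gram_value = gram_determinant(rows)             # should coincide with |Delta|^2
bound = hadamard_bound(rows)                    # should not be smaller than |Delta|^2
dependent_value = gram_determinant([[1, 2], [2, 4]])   # dependent vectors
```

With rows taken as vectors, `gram_value` reproduces $|\Delta|^2$, while `bound` is the product of the squared row norms; for the linearly dependent pair the Gram determinant vanishes.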

We can obtain further inequalities for Gram determinants on the basis of
a generalization of inequality (40).

Let X, Y, Z denote respectively sets of vectors of $R_n$. The generalization in
question has the form:

$$G(X, Y, Z) \, G(X) \le G(X, Z) \, G(X, Y). \tag{45}$$

This does not exclude the case of an empty set, i.e. one containing no
vectors at all. If W is any such set, we have to take G(W) = 1.

On the basis of this inequality, we can write the following for Gram determinants:

$$G(x^{(1)}, x^{(2)}, \ldots, x^{(m)}) \le \left[ \prod_{k=1}^{m} G(x^{(1)}, \ldots, x^{(k-1)}, x^{(k+1)}, \ldots, x^{(m)}) \right]^{1/(m-1)},$$

where $\prod$ is the product sign. Repeated application of this last expression leads
to new expressions in which the Gram determinants contain a smaller number
of vectors. In all these expressions, the sign of equality is obtained when, and
only when, the vectors are orthogonal in pairs. The expressions just given
are due to M. K. Fage (Dokl. Akad. Nauk SSSR, 1946, 54, No. 9).


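17. Systems of linear differential equations with constant coefficients.
We apply the results obtained to the problem of integrating systems
of linear differential equations with constant coefficients. We take
the system:

$$\begin{aligned}
x_1' &= a_{11} x_1 + a_{12} x_2 + \ldots + a_{1n} x_n \\
x_2' &= a_{21} x_1 + a_{22} x_2 + \ldots + a_{2n} x_n \\
\ldots \; & \qquad \ldots \qquad \ldots \\
x_n' &= a_{n1} x_1 + a_{n2} x_2 + \ldots + a_{nn} x_n,
\end{aligned} \tag{46}$$

where the $x_i$ are required functions of t, the $x_i'$ are their derivatives,
and the $a_{ik}$ are given constants. We shall seek a solution in the form:

$$x_1 = b_1 e^{\lambda t}; \quad x_2 = b_2 e^{\lambda t}; \quad \ldots; \quad x_n = b_n e^{\lambda t}. \tag{47}$$

On substituting in system (46) and cancelling the factor $e^{\lambda t}$, we get
a system of equations defining the constants $b_1, \ldots, b_n$:

$$\begin{aligned}
(a_{11} - \lambda) b_1 + a_{12} b_2 + \ldots + a_{1n} b_n &= 0 \\
a_{21} b_1 + (a_{22} - \lambda) b_2 + \ldots + a_{2n} b_n &= 0 \\
\ldots \qquad \ldots \qquad \ldots \qquad & \\
a_{n1} b_1 + a_{n2} b_2 + \ldots + (a_{nn} - \lambda) b_n &= 0.
\end{aligned} \tag{48}$$

Since a non-trivial solution is required for the unknowns $b_s$, the determinant
of this latter system must vanish, i.e. we have an equation of
the form

$$\begin{vmatrix}
a_{11} - \lambda & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} - \lambda & \ldots & a_{2n} \\
\ldots & \ldots & \ldots & \ldots \\
a_{n1} & a_{n2} & \ldots & a_{nn} - \lambda
\end{vmatrix} = 0 \tag{49}$$

for the constant $\lambda$.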

An equation of this type is generally known as a secular equation.
It is familiar in the study of unconstrained vibrating mechanical
systems in the particular case when the matrix of coefficients $a_{ik}$ is
symmetrical, i.e. $a_{ik} = a_{ki}$, and all the coefficients are real; this is a
matter that we shall discuss later in connection with small vibrations.
For the present, we shall discuss the general case. Equation (49) is
an algebraic equation of degree n with highest term $(-\lambda)^n$; if it has
n different roots:

$$\lambda = \lambda_1; \; \ldots; \; \lambda = \lambda_n,$$

substitution of each root $\lambda_j$ for $\lambda$ in the coefficients of system (48)
gives us n homogeneous equations for the corresponding $b_1, \ldots, b_n$
with a vanishing determinant, so that a non-trivial solution in fact
exists. Hence we have, from (47), n linearly independent solutions of
system (46), and a linear combination of these yields the general
solution of the system. If secular equation (49) has multiple roots, the
solution of the problem is more difficult: to each root of (49) of multiplicity
k there must correspond k linearly independent solutions of
system (46), one solution having in fact the form (47), whilst the
remainder in general contain a polynomial in t as a further factor.
It must be remarked that the possibility does occur here (as is not
the case with a single equation with constant coefficients [II, 40])
of more than one (perhaps every) solution corresponding to a multiple
root having the form (47). We shall not stop to consider this matter
in more detail because a different method will later be used for solving
system (46), based on the theory of functions of a complex variable.
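
For n = 2 the procedure just described can be carried through explicitly. The sketch below is a hedged illustration restricted, by assumption, to systems of two equations: the function names and the sample matrix are hypothetical. It solves the secular equation (49), finds a non-trivial solution of (48) for each root, and checks that the functions (47) satisfy system (46) at a sample value of t.

```python
import cmath

def secular_roots_2x2(a):
    """Roots of (49) for n = 2: lam^2 - (a11 + a22) * lam + (a11*a22 - a12*a21) = 0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace - disc) / 2, (trace + disc) / 2

def amplitude_vector(a, lam, tol=1e-12):
    """A non-trivial solution (b1, b2) of system (48) for a root lam of (49)."""
    candidate = (a[0][1], lam - a[0][0])      # satisfies the first equation of (48)
    if max(abs(c) for c in candidate) > tol:
        return candidate
    candidate = (lam - a[1][1], a[1][0])      # satisfies the second equation of (48)
    if max(abs(c) for c in candidate) > tol:
        return candidate
    return (1.0, 0.0)                         # all coefficients of (48) vanish

def residual(a, lam, b, t):
    """Largest |x_i'(t) - sum_k a_ik x_k(t)| for x_i = b_i * exp(lam * t)."""
    x = [bi * cmath.exp(lam * t) for bi in b]
    dx = [lam * xi for xi in x]               # derivative of b_i * exp(lam * t)
    return max(abs(dx[i] - sum(a[i][k] * x[k] for k in range(2))) for i in range(2))

a = [[2, 1], [1, 2]]                          # sample real symmetric matrix
lam_1, lam_2 = secular_roots_2x2(a)
b_1 = amplitude_vector(a, lam_1)
b_2 = amplitude_vector(a, lam_2)
res_1 = residual(a, lam_1, b_1, 0.5)
res_2 = residual(a, lam_2, b_2, 0.5)
```

For the sample matrix the two roots are distinct, so the two solutions of type (47) obtained in this way are linearly independent.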


We return to secular equation (49) which is fundamental to our problem.
The solution, even if approximate, of this equation presents practical difficulties
for large n, due to the fact that the unknown $\lambda$ appears along the diagonal and
not in a single row or column. Expansion of the left-hand side in powers of $\lambda$
requires a large number of computations, as indicated above in [4]. We shall
describe a method of transformation of equation (49) to a form more convenient
in practice, by means of which the unknown $\lambda$ is brought into a single column.
This method is due to Prof. A. N. Krilov, who gave the first exposition of it
in his article “The numerical solution of equations determining the frequencies
of small vibrations in material systems in engineering” (Izv. Akad. Nauk
SSSR, 1931).

We form a linear combination of the required magnitudes:

$$\xi = a_{01} x_1 + a_{02} x_2 + \ldots + a_{0n} x_n, \tag{50}$$

where the $a_{0i}$ are numerical coefficients chosen in any manner. We now differentiate
equation (50) n times with respect to t, each time replacing the derivatives
$x_i'$ on the right-hand side by their expressions from system (46). We
get the (n + 1) equations (in which the coefficients of every row after the first are
those produced by the successive differentiations and substitutions):

$$\begin{aligned}
\xi &= a_{01} x_1 + a_{02} x_2 + \ldots + a_{0n} x_n \\
\xi' &= a_{11} x_1 + a_{12} x_2 + \ldots + a_{1n} x_n \\
\ldots \; & \qquad \ldots \qquad \ldots \\
\xi^{(n-1)} &= a_{n-1,1} x_1 + a_{n-1,2} x_2 + \ldots + a_{n-1,n} x_n \\
\xi^{(n)} &= a_{n1} x_1 + a_{n2} x_2 + \ldots + a_{nn} x_n.
\end{aligned} \tag{51}$$

Let the determinant formed from the coefficients $a_{ik}$ appearing in the first
n equations differ from zero. The first n equations then give us expressions for
the $x_s$ in terms of $\xi, \xi', \ldots, \xi^{(n-1)}$, and substitution of these expressions in the
last equation gives us an nth order equation for $\xi$. Elimination of the $x_s$ from
the (n + 1) equations (51) can be carried out directly with the aid of determinants.
We first re-write these equations as

$$\begin{aligned}
\xi x_0 + a_{01} x_1 + a_{02} x_2 + \ldots + a_{0n} x_n &= 0, \\
\xi' x_0 + a_{11} x_1 + a_{12} x_2 + \ldots + a_{1n} x_n &= 0, \\
\ldots \qquad \ldots \qquad \ldots \qquad & \\
\xi^{(n)} x_0 + a_{n1} x_1 + a_{n2} x_2 + \ldots + a_{nn} x_n &= 0,
\end{aligned}$$

where $x_0 = -1$, then we consider these as a homogeneous system in the magnitudes

$$x_0, x_1, \ldots, x_n.$$

The determinant of this homogeneous system must vanish, and this in fact
gives us the required result of elimination:

$$\begin{vmatrix}
\xi & a_{01} & a_{02} & \ldots & a_{0n} \\
\xi' & a_{11} & a_{12} & \ldots & a_{1n} \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
\xi^{(n)} & a_{n1} & a_{n2} & \ldots & a_{nn}
\end{vmatrix} = 0. \tag{52}$$

We shall seek the solution of this equation in the form

$$\xi = e^{\lambda t}.$$

On substituting this in the first column of determinant (52), taking the factor
$e^{\lambda t}$ from the column outside the determinant sign, then cancelling it, we get the
following equation for $\lambda$:

$$\begin{vmatrix}
1 & a_{01} & a_{02} & \ldots & a_{0n} \\
\lambda & a_{11} & a_{12} & \ldots & a_{1n} \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
\lambda^n & a_{n1} & a_{n2} & \ldots & a_{nn}
\end{vmatrix} = 0. \tag{53}$$
It may easily be shown that, given our assumption, equation (53) has the
same roots as (49). For, let $\lambda = \lambda_1$ be a solution of (53); then we have a solution
of (52) of the form:

$$\xi = C e^{\lambda_1 t}, \tag{54}$$

where C is an arbitrary constant. The first n equations of system (51) now
give us solutions of type (47) for the $x_s$ with $\lambda = \lambda_1$, i.e. $\lambda = \lambda_1$ is in fact a
root of equation (49). Conversely, if $\lambda = \lambda_1$ is a root of (49), we have a solution
of type (47) for the $x_s$ with $\lambda = \lambda_1$, where the $b_s$ are numerical constants, not all
of which are zero. On substituting these expressions for the $x_s$ in the first of
equations (51), we in fact obtain a solution for $\xi$ of type (54), this solution being
certainly non-zero, since otherwise we should have

$$\xi = \xi' = \ldots = \xi^{(n-1)} = 0,$$

whence it would immediately follow from the first n equations of system (51)
that

$$x_1 = x_2 = \ldots = x_n = 0.$$

Thus every root $\lambda = \lambda_1$ of equation (49) is in fact a root of equation (53).
We have now shown that, given our assumption, equation (53) has the same
roots as (49). Applications of this method to numerical examples, together
with a discussion of the case when our assumption no longer holds, may be
found in the article by Prof. Krilov quoted above.

Simpler working is obtained if formula (50) is taken as $\xi = x_1$. In this case,
(53) becomes

$$\begin{vmatrix}
1 & 1 & 0 & \ldots & 0 \\
\lambda & a_{11} & a_{12} & \ldots & a_{1n} \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
\lambda^n & a_{n1} & a_{n2} & \ldots & a_{nn}
\end{vmatrix} = 0. \dagger$$

† A. Danilevskii has proposed a neat method for transforming the secular
determinant in Mat. Sbornik, 2, Sec. 1.
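
The transformation described above can be followed on a small example. The sketch below is an illustration under stated assumptions, not a reproduction of Prof. Krilov's working: the function names, the expansion of determinant (53) by its first column, and the sample matrix are choices made for the example. The rows of coefficients in (51) are generated by repeatedly multiplying the row of (50) by the matrix of system (46), which is what differentiation followed by substitution from (46) amounts to.

```python
def determinant(m):
    """Determinant by expansion along the first row (adequate for small orders)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, element in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * element * determinant(minor)
    return total

def krylov_rows(a, c0):
    """Coefficient rows of (51): row s gives xi^(s) as a combination of x_1, ..., x_n."""
    n = len(a)
    rows = [list(c0)]
    for _ in range(n):
        prev = rows[-1]
        # Differentiating sum_i prev[i] * x_i and substituting x_i' from (46).
        rows.append([sum(prev[i] * a[i][k] for i in range(n)) for k in range(n)])
    return rows

def krylov_polynomial(a, c0):
    """Coefficients p[0], ..., p[n] with determinant (53) = sum_s p[s] * lam**s."""
    rows = krylov_rows(a, c0)
    n = len(a)
    coeffs = []
    for s in range(n + 1):
        minor = [rows[r] for r in range(n + 1) if r != s]   # delete row s, first column
        coeffs.append((-1) ** s * determinant(minor))
    return coeffs

def evaluate(coeffs, lam):
    """Value of the polynomial sum_s coeffs[s] * lam**s."""
    return sum(c * lam ** s for s, c in enumerate(coeffs))

a = [[2, 1], [1, 2]]                     # sample matrix; roots of (49) are 1 and 3
p = krylov_polynomial(a, [1, 0])         # the simpler choice xi = x_1
```

For this sample matrix the list `p` holds the coefficients of $\lambda^2 - 4\lambda + 3$ in ascending powers, and `evaluate(p, lam)` vanishes at the roots 1 and 3 of the secular equation (49).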


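We consider, instead of (46), the system of second order equations:

$$\begin{aligned}
x_1'' &= a_{11} x_1 + a_{12} x_2 + \ldots + a_{1n} x_n \\
x_2'' &= a_{21} x_1 + a_{22} x_2 + \ldots + a_{2n} x_n \\
\ldots \; & \qquad \ldots \qquad \ldots \\
x_n'' &= a_{n1} x_1 + a_{n2} x_2 + \ldots + a_{nn} x_n.
\end{aligned} \tag{55}$$

Systems of this type are often encountered in mechanics. If we seek
a solution in the form

$$x_s = b_s \cos(\lambda t + \varphi),$$

we obtain an equation for $\lambda$ of the form:

$$\begin{vmatrix}
a_{11} + \lambda^2 & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} + \lambda^2 & \ldots & a_{2n} \\
\ldots & \ldots & \ldots & \ldots \\
a_{n1} & a_{n2} & \ldots & a_{nn} + \lambda^2
\end{vmatrix} = 0. \tag{56}$$

The constants $b_s$ are defined by a system analogous to system (48),
and $\varphi$ remains arbitrary.

Finally, with systems such as

$$\begin{aligned}
x_1'' &= a_{11} x_1 + \ldots + a_{1n} x_n + c_{11} x_1' + \ldots + c_{1n} x_n' \\
x_2'' &= a_{21} x_1 + \ldots + a_{2n} x_n + c_{21} x_1' + \ldots + c_{2n} x_n' \\
\ldots \; & \qquad \ldots \qquad \ldots \\
x_n'' &= a_{n1} x_1 + \ldots + a_{nn} x_n + c_{n1} x_1' + \ldots + c_{nn} x_n',
\end{aligned} \tag{57}$$

which also include the first derivatives, we again seek a solution of
type (47), and arrive at a secular equation of the form:

$$\begin{vmatrix}
a_{11} + c_{11}\lambda - \lambda^2 & a_{12} + c_{12}\lambda & \ldots & a_{1n} + c_{1n}\lambda \\
\ldots & \ldots & \ldots & \ldots \\
a_{n1} + c_{n1}\lambda & a_{n2} + c_{n2}\lambda & \ldots & a_{nn} + c_{nn}\lambda - \lambda^2
\end{vmatrix} = 0. \tag{58}$$

If we introduce the supplementary unknowns:

$$x_{n+1} = x_1'; \quad x_{n+2} = x_2'; \quad \ldots; \quad x_{2n} = x_n', \tag{59}$$

we can reduce system (57) to 2n first order equations, n of these
being obtained from (57) by substituting

$$x_j' = x_{n+j} \quad \text{and} \quad x_j'' = x_{n+j}' \qquad (j = 1, 2, \ldots, n),$$

whilst the remaining n are equations (59).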


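18. Functional determinants. Let us take n functions of n variables:

$$\varphi_1(x_1, x_2, \ldots, x_n); \; \varphi_2(x_1, x_2, \ldots, x_n); \; \ldots; \; \varphi_n(x_1, x_2, \ldots, x_n). \tag{60}$$

The functional determinant of these functions in the variables $x_s$ is
the nth order determinant whose elements are given by $a_{ik} = \partial\varphi_i / \partial x_k$.
We bring in the special notation for the functional determinant:

$$\frac{D(\varphi_1, \varphi_2, \ldots, \varphi_n)}{D(x_1, x_2, \ldots, x_n)} = \begin{vmatrix}
\dfrac{\partial\varphi_1}{\partial x_1} & \dfrac{\partial\varphi_1}{\partial x_2} & \ldots & \dfrac{\partial\varphi_1}{\partial x_n} \\
\dfrac{\partial\varphi_2}{\partial x_1} & \dfrac{\partial\varphi_2}{\partial x_2} & \ldots & \dfrac{\partial\varphi_2}{\partial x_n} \\
\ldots & \ldots & \ldots & \ldots \\
\dfrac{\partial\varphi_n}{\partial x_1} & \dfrac{\partial\varphi_n}{\partial x_2} & \ldots & \dfrac{\partial\varphi_n}{\partial x_n}
\end{vmatrix}. \tag{61}$$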

We have already encountered determinants of this type in the change
of variables in multiple integrals [II, 57 and 60]. If we have the change
of variables on a plane:

$$x = \varphi(u, v); \qquad y = \psi(u, v), \tag{62}$$

where the point (u, v) becomes the point (x, y), the absolute value
of the functional determinant (Jacobian)

$$\frac{D(\varphi, \psi)}{D(u, v)} \tag{63}$$

gives the coefficient of change of area at the point (u, v) under transformation
(62), on the assumption that the partial derivatives of the
functions of (62) with respect to u and v are continuous, and that
determinant (63) does not vanish, in the domain over which the
transformation is applied. Similarly, if we have the point transformation
in three-dimensional space:

$$x = \varphi(q_1, q_2, q_3); \quad y = \psi(q_1, q_2, q_3); \quad z = \omega(q_1, q_2, q_3),$$

where the point with coordinates $(q_1, q_2, q_3)$ becomes the point (x, y, z)
and the volume $(V_1)$ becomes volume (V), the formula for change
of variables in the triple integral may be written [II, 60]:

$$\iiint_{(V)} f(x, y, z) \, dx \, dy \, dz = \iiint_{(V_1)} f[\varphi, \psi, \omega] \, |D| \, dq_1 \, dq_2 \, dq_3,$$

where

$$D = \frac{D(\varphi, \psi, \omega)}{D(q_1, q_2, q_3)},$$

and |D| is the coefficient of cubical change at a given point on
transforming from $(q_1, q_2, q_3)$ to (x, y, z).

We might have considered the single function of a single independent
variable:

$$u = f(x)$$

in exactly the same way, as a transformation of points on the axis OX,
in which the point with abscissa x takes up a new position with abscissa u.
The absolute value $|f'(x)|$ of the derivative obviously characterizes
the change in linear measure at a given point. Everything that has
been said may be extended to point transformations in n-dimensional
space and to the change of variables in n-tuple integrals [II, 98].

Having explained the two and three-dimensional analogies between
functional determinants and derivatives, we now show that there are
analogies as regards their formal properties.

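Let us take the system of functions

$$\varphi_1(y_1, \ldots, y_n), \; \ldots, \; \varphi_n(y_1, \ldots, y_n),$$

and instead of $y_1, \ldots, y_n$ being independent variables, let them be
in turn functions of $x_1, \ldots, x_n$, so that in the last analysis the $\varphi_i$ are
functions of the $x_s$. We can form three functional determinants:

$$\frac{D(\varphi_1, \ldots, \varphi_n)}{D(y_1, \ldots, y_n)}; \qquad \frac{D(\varphi_1, \ldots, \varphi_n)}{D(x_1, \ldots, x_n)}; \qquad \frac{D(y_1, \ldots, y_n)}{D(x_1, \ldots, x_n)}.$$

The elements of these determinants are respectively

$$\frac{\partial\varphi_i}{\partial y_k}; \qquad \frac{\partial\varphi_i}{\partial x_k}; \qquad \frac{\partial y_i}{\partial x_k}.$$

But we have by the rule for differentiating functions of a function:

$$\frac{\partial\varphi_i}{\partial x_k} = \frac{\partial\varphi_i}{\partial y_1} \frac{\partial y_1}{\partial x_k} + \ldots + \frac{\partial\varphi_i}{\partial y_n} \frac{\partial y_n}{\partial x_k},$$

and the determinant multiplication rule of rows by columns gives us
an equation expressing the first property of functional determinants:

$$\frac{D(\varphi_1, \ldots, \varphi_n)}{D(x_1, \ldots, x_n)} = \frac{D(\varphi_1, \ldots, \varphi_n)}{D(y_1, \ldots, y_n)} \cdot \frac{D(y_1, \ldots, y_n)}{D(x_1, \ldots, x_n)}. \tag{64}$$

This is analogous to the rule for differentiating functions of a function
with a single independent variable.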

A further property of functional determinants is as follows. The
system of functions $\varphi_i$ can be considered as a transformation of variables
$x_i$ to the new variables $\varphi_i$:

$$\varphi_i = \varphi_i(x_1, \ldots, x_n) \quad (i = 1, 2, \ldots, n). \tag{65}$$

We first notice the particular case of the so-called identity
transformation:

$$\varphi_1 = x_1, \; \varphi_2 = x_2, \; \ldots, \; \varphi_n = x_n.$$




Its functional determinant is 


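$$\begin{vmatrix} 1 & 0 & 0 & \ldots & 0 \\ 0 & 1 & 0 & \ldots & 0 \\ 0 & 0 & 1 & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & 0 & \ldots & 1 \end{vmatrix} = 1.$$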


We shall imagine that equations (65) have been solved with respect
to the $x_i$, so that the $x_i$ can be written in terms of the $\varphi_i$:

$$x_i = x_i(\varphi_1, \ldots, \varphi_n) \quad (i = 1, 2, \ldots, n). \tag{66}$$

Transformation (66) is naturally known as the inverse of (65). If
expressions (66) are substituted in the right-hand sides of (65), we get
the identities: $\varphi_1 = \varphi_1, \ldots, \varphi_n = \varphi_n$, or in other words, we get the
identity transformation. On now applying formula (64) to this
particular case, we have to put $y_i = x_i$ and $x_i = \varphi_i$, whilst we have the
functional determinant of the identity transformation on the left-hand
side:

$$\frac{D(\varphi_1, \ldots, \varphi_n)}{D(\varphi_1, \ldots, \varphi_n)} = \frac{D(\varphi_1, \ldots, \varphi_n)}{D(x_1, \ldots, x_n)} \cdot \frac{D(x_1, \ldots, x_n)}{D(\varphi_1, \ldots, \varphi_n)},$$

or

$$\frac{D(\varphi_1, \ldots, \varphi_n)}{D(x_1, \ldots, x_n)} \cdot \frac{D(x_1, \ldots, x_n)}{D(\varphi_1, \ldots, \varphi_n)} = 1, \tag{67}$$

i.e. the product of the functional determinants of the direct and inverse
transformations is unity. This is analogous to the property of the
derivatives of inverse functions in the case of a single independent variable.
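
Property (67) can be checked numerically on a familiar pair of mutually inverse transformations. The sketch below is only an illustration: the choice of polar coordinates and the function names `jacobian_direct` and `jacobian_inverse` are assumptions made for the example, not part of the text. The partial derivatives are written out by hand.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def jacobian_direct(r, theta):
    """D(x, y)/D(r, theta) for x = r cos(theta), y = r sin(theta)."""
    return det2([[math.cos(theta), -r * math.sin(theta)],
                 [math.sin(theta), r * math.cos(theta)]])


def jacobian_inverse(x, y):
    """D(r, theta)/D(x, y) for r = sqrt(x^2 + y^2), theta = atan2(y, x)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return det2([[x / r, y / r],
                 [-y / r2, x / r2]])


# An arbitrary sample point away from the origin, where both maps are smooth.
r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)

product = jacobian_direct(r, theta) * jacobian_inverse(x, y)
print(jacobian_direct(r, theta), jacobian_inverse(x, y), product)
```

The direct determinant equals $r$ and the inverse one equals $1/r$, so their product is unity up to rounding, in agreement with (67).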

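We now explain the meaning of the condition that the functional
determinant

$$\frac{D(\varphi_1, \varphi_2, \ldots, \varphi_n)}{D(x_1, x_2, \ldots, x_n)} \tag{68}$$

of the functions

$$\varphi_1(x_1, \ldots, x_n), \; \varphi_2(x_1, \ldots, x_n), \; \ldots, \; \varphi_n(x_1, \ldots, x_n)$$

with respect to the variables $x_s$ is identically equal to zero. Suppose that these
functions are connected by the functional relationship

$$F(\varphi_1, \ldots, \varphi_n) = 0, \tag{69}$$

this equation being an identity in the independent variables $x_s$. If we differentiate
with respect to all the independent variables, we get the $n$ identities:

$$\begin{aligned}
\frac{\partial F}{\partial \varphi_1}\frac{\partial \varphi_1}{\partial x_1} + \ldots + \frac{\partial F}{\partial \varphi_n}\frac{\partial \varphi_n}{\partial x_1} &= 0 \\
\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots \\
\frac{\partial F}{\partial \varphi_1}\frac{\partial \varphi_1}{\partial x_n} + \ldots + \frac{\partial F}{\partial \varphi_n}\frac{\partial \varphi_n}{\partial x_n} &= 0.
\end{aligned} \tag{70}$$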




We can look on these $n$ identities as linear equations in the $n$ quantities

$$\frac{\partial F}{\partial \varphi_1}, \; \ldots, \; \frac{\partial F}{\partial \varphi_n},$$

where it is clear that the quantities cannot vanish identically at the same time,
since otherwise $F$ would contain none of the $\varphi_i$. The determinant of homogeneous
system (70) must therefore vanish, which amounts to the vanishing of functional
determinant (68). The presence of functional relationship (69) thus implies
that functional determinant (68) vanishes identically. We shall not dwell
on the proof of the converse, which is also true, i.e. the vanishing identically
of functional determinant (68) is the necessary and sufficient condition for a
relationship to exist between the functions $\varphi_1, \ldots, \varphi_n$.†

We take the example of three functions of three independent variables: 


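$$\varphi_1 = x_1^2 + x_2^2 + x_3^2; \quad \varphi_2 = x_1 + x_2 + x_3; \quad \varphi_3 = x_1 x_2 + x_1 x_3 + x_2 x_3. \tag{71}$$

It may easily be verified that the following relationship exists between these:

$$\varphi_2^2 - \varphi_1 - 2\varphi_3 = 0.$$

We form the functional determinant for functions (71):

$$\frac{D(\varphi_1, \varphi_2, \varphi_3)}{D(x_1, x_2, x_3)} = \begin{vmatrix} 2x_1 & 2x_2 & 2x_3 \\ 1 & 1 & 1 \\ x_2 + x_3 & x_1 + x_3 & x_1 + x_2 \end{vmatrix}.$$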


We suggest that the reader show that this determinant is identically zero. 
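
Before attempting the general proof, the claim can be spot-checked at a few points. The following is a small sketch, not a proof: the sample points and the helper names `phis`, `jacobian` and `det3` are chosen here for illustration. Exact rational arithmetic is used so that a zero result is exactly zero.

```python
from fractions import Fraction


def phis(x1, x2, x3):
    """The three functions (71)."""
    return (x1 * x1 + x2 * x2 + x3 * x3,
            x1 + x2 + x3,
            x1 * x2 + x1 * x3 + x2 * x3)


def jacobian(x1, x2, x3):
    """Matrix of partial derivatives of (71), row i holding d(phi_i)/d(x_k)."""
    return [[2 * x1, 2 * x2, 2 * x3],
            [1, 1, 1],
            [x2 + x3, x1 + x3, x1 + x2]]


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Arbitrary sample points (assumed for the example).
points = [(1, 2, 3), (-4, 0, 7), (Fraction(1, 2), Fraction(-3, 5), 9)]

checks = []
for p in points:
    p1, p2, p3 = phis(*p)
    relation = p2 * p2 - p1 - 2 * p3      # left-hand side of the relationship
    checks.append((relation, det3(jacobian(*p))))

print(checks)
```

Each pair printed consists of the value of the relationship and the value of the determinant at one point; both are zero at every point tried. The general argument is visible in the matrix itself: the third row is the second row multiplied by $x_1 + x_2 + x_3$ minus half the first row.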




19. Implicit functions. We proved the existence theorem for the implicit
function defined by a single equation in Vol. I [I, 159]. We now generalize this
for the case of a system of equations. The original theorem will first be re-stated:
let $x = x_0$, $y = y_0$ be a solution of the equation

$$F(x, y) = 0 \tag{72}$$

and let $F(x, y)$ and its first order partial derivatives be continuous at and in
the neighbourhood of $x = x_0$, $y = y_0$; also, let the partial derivative $F'_y(x, y)$
differ from zero at $x = x_0$, $y = y_0$. For $x$ sufficiently close to $x_0$, equation
(72) now defines a unique function $y(x)$ which is continuous, has a derivative,
and satisfies the condition $y(x_0) = y_0$. As we have already mentioned, it can
similarly be shown that the equation

$$F(x, y, z) = 0,$$

having the solution $x = x_0$, $y = y_0$, $z = z_0$, where $F(x, y, z)$ and its first
order partial derivatives are continuous in the neighbourhood of this solution,
and $F'_z(x_0, y_0, z_0) \neq 0$, uniquely defines a function $z(x, y)$ which is continuous
in the neighbourhood of $x = x_0$, $y = y_0$, possesses derivatives with respect
to $x$ and $y$, and satisfies the condition $z(x_0, y_0) = z_0$. We now consider the
system of two equations:

$$\varphi(x, y, z) = 0; \quad \psi(x, y, z) = 0. \tag{73}$$


† It must be pointed out that our discussion regarding system (70) is of a
formal nature and is not, strictly speaking, a proof.




Let this system have a solution $x = x_0$, $y = y_0$, $z = z_0$, let $\varphi(x, y, z)$, $\psi(x, y, z)$
and their partial derivatives be continuous in the neighbourhood of the solution,
and let the functional determinant

$$\frac{D(\varphi, \psi)}{D(y, z)} = \begin{vmatrix} \dfrac{\partial \varphi}{\partial y} & \dfrac{\partial \varphi}{\partial z} \\ \dfrac{\partial \psi}{\partial y} & \dfrac{\partial \psi}{\partial z} \end{vmatrix} = \frac{\partial \varphi}{\partial y}\frac{\partial \psi}{\partial z} - \frac{\partial \varphi}{\partial z}\frac{\partial \psi}{\partial y} \tag{74}$$

be non-zero at the solution. With these conditions, and with $x$ sufficiently close to
$x_0$, system (73) defines a unique system of functions $y(x)$, $z(x)$ which are continuous,
have first order derivatives, and satisfy the condition $y(x_0) = y_0$, $z(x_0) = z_0$.
Since expression (74) differs from zero at $x = x_0$, $y = y_0$, $z = z_0$, at least
one of the partial derivatives $\partial\psi/\partial y$ or $\partial\psi/\partial z$ must differ from zero. Suppose
that, say, $\partial\psi/\partial z$ is non-zero at the solution. By the theorem stated above, the
second of equations (73) uniquely defines a function $z(x, y)$. On substituting
this function in the first equation of the system, we get an equation in the
variables $x$ and $y$:

$$\varphi[x, y, z(x, y)] = 0. \tag{75}$$


It only remains for us to show, in order to prove the theorem, that the partial
derivative with respect to $y$ of the left-hand side of (75) differs from zero for
$x = x_0$, $y = y_0$. This partial derivative is given by

$$\left(\frac{\partial \varphi}{\partial y}\right) = \frac{\partial \varphi}{\partial y} + \frac{\partial \varphi}{\partial z}\frac{\partial z}{\partial y}, \tag{76}$$

where $(\partial\varphi/\partial y)$ is the total derivative of $\varphi(x, y, z)$ with respect to the argument
$y$. Since $z(x, y)$ is the solution of the second of equations (73), we have the
identity:

$$\psi[x, y, z(x, y)] = 0.$$

We differentiate this identity with respect to $y$:

$$\frac{\partial \psi}{\partial y} + \frac{\partial \psi}{\partial z}\frac{\partial z}{\partial y} = 0. \tag{77}$$

We multiply both sides of (76) by $\partial\psi/\partial z$ and add to (77) after multiplying both
sides of (77) by $-\partial\varphi/\partial z$. This gives us after simple working:

$$\frac{\partial \psi}{\partial z}\left(\frac{\partial \varphi}{\partial y}\right) = \frac{D(\varphi, \psi)}{D(y, z)}.$$

The function $z(x, y)$ becomes $z_0$ with $x = x_0$, $y = y_0$, and $\partial\psi/\partial z$ and (74) differ
from zero with these values of the variables, so that $(\partial\varphi/\partial y)$ is also non-zero.
Consequently equation (75) defines a unique function $y(x)$. On substituting
this in $z(x, y)$, we in fact obtain $z$ as a function of $x$. This proof is possible with
several independent variables instead of $x$.
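
The theorem asserts existence and uniqueness but gives no formula for $y(x)$, $z(x)$. As a hedged numerical illustration, the sketch below takes a particular system of type (73), chosen here only as an example, with the solution $x_0 = y_0 = z_0 = 1$, at which determinant (74) equals $-4$. It then computes $y(x)$, $z(x)$ for $x$ near $x_0$ by Newton's iteration, which is a standard numerical device and not the method of the proof above. The names `phi`, `psi` and `solve_yz` are assumptions of the example.

```python
def phi(x, y, z):
    """First equation of the example system: x^2 + y^2 + z^2 - 3 = 0."""
    return x * x + y * y + z * z - 3.0


def psi(x, y, z):
    """Second equation of the example system: x + y - z - 1 = 0."""
    return x + y - z - 1.0


def solve_yz(x, y=1.0, z=1.0, steps=20):
    """Newton iteration in (y, z) for fixed x, started at the known solution (1, 1)."""
    for _ in range(steps):
        # Entries of the functional determinant (74) at the current point.
        phi_y, phi_z = 2.0 * y, 2.0 * z
        psi_y, psi_z = 1.0, -1.0
        jac = phi_y * psi_z - phi_z * psi_y
        f, g = phi(x, y, z), psi(x, y, z)
        # Newton correction obtained from the 2x2 linear system by Cramer's rule.
        dy = (f * psi_z - phi_z * g) / jac
        dz = (phi_y * g - psi_y * f) / jac
        y, z = y - dy, z - dz
    return y, z


y1, z1 = solve_yz(1.1)
print(y1, z1, phi(1.1, y1, z1), psi(1.1, y1, z1))
```

The iteration relies on exactly the hypothesis of the theorem: the determinant `jac` must stay away from zero near the solution, otherwise the correction step is undefined.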

The implicit function theorem may be stated as follows in the general case: 
Let the system of equations 


$$F_1(x_1, \ldots, x_m, y_1, \ldots, y_n) = 0; \; \ldots; \; F_n(x_1, \ldots, x_m, y_1, \ldots, y_n) = 0, \tag{78}$$




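have the solution

$$x_k = x_k^{(0)}, \quad y_l = y_l^{(0)} \quad \begin{pmatrix} k = 1, \ldots, m \\ l = 1, \ldots, n \end{pmatrix}; \tag{79}$$

let the $F_l$ be continuous and have continuous first order partial derivatives in
the neighbourhood of solution (79), and finally, let the functional determinant

$$\frac{D(F_1, \ldots, F_n)}{D(y_1, \ldots, y_n)} = \begin{vmatrix}
\dfrac{\partial F_1}{\partial y_1} & \dfrac{\partial F_1}{\partial y_2} & \ldots & \dfrac{\partial F_1}{\partial y_n} \\
\dfrac{\partial F_2}{\partial y_1} & \dfrac{\partial F_2}{\partial y_2} & \ldots & \dfrac{\partial F_2}{\partial y_n} \\
\ldots & \ldots & \ldots & \ldots \\
\dfrac{\partial F_n}{\partial y_1} & \dfrac{\partial F_n}{\partial y_2} & \ldots & \dfrac{\partial F_n}{\partial y_n}
\end{vmatrix} \tag{80}$$

differ from zero at solution (79). Then for $x_k$ sufficiently close to $x_k^{(0)}$, equations
(78) define a unique system of functions $y_l(x_1, \ldots, x_m)$ that are continuous, possess
first order derivatives, and satisfy the conditions $y_l(x_1^{(0)}, \ldots, x_m^{(0)}) = y_l^{(0)}$.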

We shall sketch out the proof of this theorem. We assume it to be true for
$(n - 1)$ equations (it is in fact valid for $n = 1$ and $n = 2$), then show that it is
true for $n$ equations. By expanding determinant (80) by its first column, we
can say that at least one of the corresponding cofactors must be non-zero for
values (79), since (80) is itself non-zero at the solution by hypothesis. We
can choose the subscripts for the $F_i$ in such a way that the cofactor of $\partial F_1/\partial y_1$
is non-zero. This cofactor consists of the functional determinant of $F_2, \ldots, F_n$
with respect to variables $y_2, \ldots, y_n$. By the theorem for the system of $(n - 1)$
equations, the equations

$$F_2(x_1, \ldots, x_m, y_1, y_2, \ldots, y_n) = 0; \; \ldots; \; F_n(x_1, \ldots, x_m, y_1, y_2, \ldots, y_n) = 0 \tag{81}$$

uniquely define the functions

$$y_2 = \varphi_2(x_1, \ldots, x_m, y_1), \; \ldots, \; y_n = \varphi_n(x_1, \ldots, x_m, y_1). \tag{82}$$

On substituting these functions in the first of equations (78), we get the equation
for $y_1$:

$$F_1(x_1, \ldots, x_m, y_1, \varphi_2, \ldots, \varphi_n) = 0. \tag{83}$$


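It remains for us to verify that the total derivative of the left-hand side of this
equation with respect to $y_1$ differs from zero for values (79). The derivative
is given by

$$\left(\frac{\partial F_1}{\partial y_1}\right) = \frac{\partial F_1}{\partial y_1} + \sum_{s=2}^{n} \frac{\partial F_1}{\partial \varphi_s}\frac{\partial \varphi_s}{\partial y_1}. \tag{84}$$

On substituting functions (82) in the left-hand sides of equations (81), we obtain
identities which we differentiate with respect to $y_1$:

$$\frac{\partial F_i}{\partial y_1} + \sum_{s=2}^{n} \frac{\partial F_i}{\partial \varphi_s}\frac{\partial \varphi_s}{\partial y_1} = 0 \quad (i = 2, \ldots, n). \tag{85}$$

Let $A_1, A_2, \ldots, A_n$ denote the cofactors of the elements of the first column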




of (80). On multiplying (84) by $A_1$ and (85) by $A_i$ $(i = 2, \ldots, n)$, then adding
the resulting equations, we obtain the equation:

$$A_1\left(\frac{\partial F_1}{\partial y_1}\right) = \sum_{i=1}^{n} A_i \frac{\partial F_i}{\partial y_1} + \sum_{s=2}^{n}\left[\sum_{i=1}^{n} A_i \frac{\partial F_i}{\partial \varphi_s}\right]\frac{\partial \varphi_s}{\partial y_1}.$$

The first sum on the right-hand side yields determinant (80), which we shall
write simply as $D$ for brevity, whilst the summation over $i$ in the second term
represents the sum of products of elements belonging to a column other than the
first of $D$ with the cofactors of the corresponding elements of the first column,
i.e. the sum is zero. We remark here that differentiation with respect to $\varphi_s$
is exactly the same as differentiation with respect to $y_s$. The above equation
thus reduces to

$$A_1\left(\frac{\partial F_1}{\partial y_1}\right) = D.$$

Since $A_1$ and $D$ do not vanish at solution (79), the same can be said regarding the
derivative with respect to $y_1$ of the left-hand side of equation (83); hence (83)
yields a unique function $y_1(x_1, x_2, \ldots, x_m)$. Substitution in functions (82) gives
us the final result.

The inversion theorem for systems of functions is a particular case of the implicit
function theorem. Let the equations be given:

$$y_k = f_k(x_1, \ldots, x_n) \quad (k = 1, 2, \ldots, n). \tag{86}$$

Let the functions $f_k$ and their first order derivatives be continuous in the
neighbourhood of $x_k = x_k^{(0)}$ $(k = 1, 2, \ldots, n)$, for which values the functional determinant

$$\frac{D(f_1, \ldots, f_n)}{D(x_1, \ldots, x_n)} \tag{87}$$

is different from zero. Then equations (86) uniquely define $x_k(y_1, \ldots, y_n)$ as
functions of $y_1, \ldots, y_n$ in the neighbourhood of $y_k^{(0)} = f_k(x_1^{(0)}, \ldots, x_n^{(0)})$, these
functions being continuous and having first order derivatives, whilst
$x_k(y_1^{(0)}, \ldots, y_n^{(0)}) = x_k^{(0)}$.

We prove this theorem simply by taking the equations

$$f_k(x_1, \ldots, x_n) - y_k = 0 \quad (k = 1, 2, \ldots, n)$$

and applying the implicit function theorem, the role of $y_k$ being played by $x_k$.
If the $f_k$ are linear homogeneous functions of the variables $x_i$, system (86)
has the form:

$$y_k = a_{k1}x_1 + a_{k2}x_2 + \ldots + a_{kn}x_n.$$

Determinant (87) reduces in this case to the determinant $|a_{ik}|$ of the coefficients
$a_{ik}$, and the existence of a unique solution of the system depends on Cramer’s
theorem.


CHAPTER II


LINEAR TRANSFORMATIONS 
AND QUADRATIC FORMS 


20. Coordinate transformations in three-dimensional space. A linear
transformation in $n$ variables is defined by:

$$\begin{aligned}
x'_1 &= a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n \\
x'_2 &= a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n \\
&\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots \\
x'_n &= a_{n1}x_1 + a_{n2}x_2 + \ldots + a_{nn}x_n.
\end{aligned} \tag{1}$$

This can be interpreted as the passage from a vector $(x_1, \ldots, x_n)$
of $n$-dimensional space to another vector $(x'_1, \ldots, x'_n)$. Alternatively,
we can regard $(x_1, \ldots, x_n)$ as the coordinates of a point in $n$-dimensional
space and (1) as the passage from this point to another.

Yet another interpretation is possible: we can regard $(x_1, \ldots, x_n)$ and
$(x'_1, \ldots, x'_n)$ as the components of the same vector (or coordinates of
the same point), but with different choices of axes. Expressions (1)
now give the transformation of components (coordinates) on passing
from one coordinate system to the other. Expressions of type (1) with
$n = 2$ and $n = 3$ have already been encountered a number of times.

The first part of the present chapter is devoted to a detailed study 
of linear transformation (1). We start with real three-dimensional space 
for the sake of greater clarity, then pass to the general case of complex 
n-dimensional space. Our discussion for three-dimensional space begins 
with the most elementary case, when (1) corresponds to passage from 
one set of rectangular axes to another. On measuring vectors from the
origin, we can obviously take $(x_1, x_2, x_3)$ as either the components of a
vector or the coordinates of its terminus.

The expressions for transforming Cartesian coordinates are familiar 
from analytic geometry: 

$$\begin{aligned}
x'_1 &= a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\
x'_2 &= a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \\
x'_3 &= a_{31}x_1 + a_{32}x_2 + a_{33}x_3,
\end{aligned} \tag{2}$$




where the $a_{ik}$ are the cosines of the angles formed by the new axes
with the old and are given by the following table:

|        | $X_1$    | $X_2$    | $X_3$    |
| $X'_1$ | $a_{11}$ | $a_{12}$ | $a_{13}$ |
| $X'_2$ | $a_{21}$ | $a_{22}$ | $a_{23}$ |
| $X'_3$ | $a_{31}$ | $a_{32}$ | $a_{33}$ |

(3)


We know that the array of coefficients in (3) has the following
properties: the sum of the squares of the elements of each row and
column is equal to unity, and the sum of the products of corresponding
elements of two different rows or columns is zero. The value of
the determinant

$$|a_{ik}|$$

is clearly equal [5] in magnitude to the volume of the rectangular parallelepiped
with unit sides directed along the new axes, i.e. it is unity if the axes
have the same orientation, and $(-1)$ if the orientation is different.
The inverse transformation from $(x'_1, x'_2, x'_3)$ to $(x_1, x_2, x_3)$ will clearly be:

$$\begin{aligned}
x_1 &= a_{11}x'_1 + a_{21}x'_2 + a_{31}x'_3 \\
x_2 &= a_{12}x'_1 + a_{22}x'_2 + a_{32}x'_3 \\
x_3 &= a_{13}x'_1 + a_{23}x'_2 + a_{33}x'_3.
\end{aligned} \tag{$2_1$}$$

In other words, the inverse transformation to (2) is simply obtained by
interchange of rows with columns in the array of coefficients of (2).
The determinant of the inverse transformation is obviously equal to
the determinant of (2).

We now show that the properties mentioned of the coefficients of (2) 
can be obtained by satisfying a single requirement that follows at once 
from the geometrical nature of our problem. We look for all the real 
transformations of type (2) such that 


$${x'_1}^2 + {x'_2}^2 + {x'_3}^2 = x_1^2 + x_2^2 + x_3^2. \tag{4}$$


This statement of the problem enables us to generalize our discussion 
of transformations to the case of space with any number of dimensions. 
What we do is show that the transformations required by the new 
problem are the same as those discussed above, i.e. we show that 
requirement (4) leads to the previous relationships between the $a_{ik}$.
We substitute from (2) in the left-hand side of (4), remove the brackets, 




then equate the coefficients of the squares of the variables to unity,
and the coefficients of the products of different variables to zero; this
gives us six relationships of the type:

$$a_{1k}a_{1l} + a_{2k}a_{2l} + a_{3k}a_{3l} = \delta_{kl} \quad (k, l = 1, 2, 3), \tag{5}$$

where

$$\delta_{kl} = 0 \text{ for } k \neq l \text{ and } \delta_{kk} = 1, \tag{6}$$

i.e. the sum of the squares of the elements of each column is unity and
the sum of the products of corresponding elements of different columns
is zero. These conditions are generally known as orthogonality conditions
in regard to columns. It now follows immediately that the elements
of each column are the direction-cosines of a certain straight line, and
that the straight lines corresponding to different columns are mutually
perpendicular. This implies in turn that in the present case
transformation (2) coincides with that considered above, and that we have
orthogonality in regard to rows as well as to columns.
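
The orthogonality conditions can be verified numerically for a concrete table of direction-cosines. The sketch below is an illustration under an assumed choice: the new axes are obtained by turning the old ones through an angle about $X_3$, and the helper names are hypothetical. It checks the column conditions (5), the corresponding row conditions, the value of the determinant, and requirement (4).

```python
import math


def direction_cosines(angle):
    """Table (3) when the new axes are the old ones turned through `angle` about X_3."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]


def column_products(a):
    """Left-hand sides of (5): entry (k, l) is the sum over i of a[i][k] * a[i][l]."""
    return [[sum(a[i][k] * a[i][l] for i in range(3)) for l in range(3)]
            for k in range(3)]


def row_products(a):
    """The same sums taken along rows instead of columns."""
    return [[sum(a[k][i] * a[l][i] for i in range(3)) for l in range(3)]
            for k in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_identity(m, tol=1e-12):
    """True if m equals the Kronecker delta (6) up to rounding."""
    return all(abs(m[i][k] - (1.0 if i == k else 0.0)) <= tol
               for i in range(3) for k in range(3))


a = direction_cosines(0.9)
x = [1.0, -2.0, 0.5]
x_new = [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]

print(is_identity(column_products(a)), is_identity(row_products(a)))
print(det3(a))
print(sum(v * v for v in x_new), sum(v * v for v in x))
```

Both orthogonality checks succeed, the determinant is $+1$ up to rounding because the orientation is preserved, and the sum of squares of the components is unchanged.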

We can look on (2) as a transformation of space with fixed axes,
instead of a coordinate transformation in a fixed space. Suppose first
that the transformation determinant is equal to $(+1)$, i.e. both
systems of axes have the same orientation. We can now rotate the
space like a solid body about the origin together with the axes
$(X'_1, X'_2, X'_3)$ so that these axes coincide with $(X_1, X_2, X_3)$, which we
take to be fixed during the rotation and to which we refer the
coordinates of every point both before and after rotation. If a point $M$
had the coordinates $(x_1, x_2, x_3)$ before rotation, it takes up a new
position $M'$ as a result of rotation and has the new coordinates $(x'_1, x'_2, x'_3)$.
Since the point $M$ moves with the axes $(X'_1, X'_2, X'_3)$, the coordinates
$(x'_1, x'_2, x'_3)$ of $M'$ with respect to $(X_1, X_2, X_3)$, with which $(X'_1, X'_2, X'_3)$
have come to coincide as a result of the rotation, will be the same as
the coordinates of $M$ with respect to $(X'_1, X'_2, X'_3)$ before rotation.
Hence it may be seen that expressions (2) represent, in the case of the
$(+1)$ determinant, a transformation of the coordinates of a point as
a result of rotating the space.

Now let $|a_{ik}|$ be equal to $(-1)$. We consider instead of (2) the
transformation

$$x''_i = -a_{i1}x_1 - a_{i2}x_2 - a_{i3}x_3 \quad (i = 1, 2, 3).$$

Its coefficients possess properties (5) as before, whilst its determinant
now has the value $(+1)$, i.e. it corresponds to a rotation of the space
about the origin. In order to obtain the coordinates $(x'_1, x'_2, x'_3)$, we




have to carry out the further transformation:

$$x'_1 = -x''_1; \quad x'_2 = -x''_2; \quad x'_3 = -x''_3,$$

which is a symmetry transformation about the origin, inasmuch as
the signs of all the coordinates are changed. Thus transformation (2)
corresponds, in the case of the $(-1)$ determinant, to a rotation of the
space about the origin followed by a symmetrical shift with respect
to the origin.

We saw above that the nine coefficients $a_{ik}$ have to satisfy the six
relationships (5). This means that they are expressible in terms of three
independent parameters. We shall indicate one possible choice of
parameters in the case of a rotation of space about the origin.

We bring in two systems of coordinate axes: $(X'_1, X'_2, X'_3)$ is a fixed
system to which all the coordinates are referred, whilst $(X_1, X_2, X_3)$
has an invariable relationship with the rotating space. In order to
define the rotation, we have to establish three parameters defining
the position of the second system of axes relatively to the first. Let
the planes $X'_1 O X'_2$ and $X_1 O X_2$ intersect in $ON$ (Fig. 1). We make a
definite choice of direction along this line and let $\alpha$ be the angle
$X'_1 O N$, reckoned from $OX'_1$. We also introduce the angles $\beta = X'_3 O X_3$
and $\gamma = N O X_1$. These three angles completely characterize the position
of the second system relative to the first, i.e. they completely
characterize the rotation, which we shall denote by the symbol $\{\alpha, \beta, \gamma\}$.
It follows at once from the above that our motion is the result of
consecutively carrying out the following three motions: (1) rotation
by angle $\alpha$ about the axis $X'_3$; (2) rotation by the angle $\beta$ about the
new position of $X'_1$; (3) rotation by angle $\gamma$ about the new axis $X'_3$.
These three angles are generally known as Euler’s angles, and we can
write their limits of variation as follows:

$$0 \le \alpha < 2\pi; \quad 0 \le \beta \le \pi; \quad 0 \le \gamma < 2\pi.$$

If $\beta = 0$, the motion $\{\alpha, \beta, \gamma\}$ simply reduces to a rotation by the
angle $\alpha + \gamma$ about axis $X'_3$, and we have in this sense for any $\delta$:

$$\{\alpha, 0, \gamma\} = \{\alpha + \delta, 0, \gamma - \delta\}.$$




This shows that, for certain values of the parameters $\{\alpha, \beta, \gamma\}$, the
correspondence between rotations of the space about the origin and
the parameters is not single-valued; in other words, the same rotation
corresponds to different values of the parameters. Expressions may
readily be deduced for the coefficients $a_{ik}$ in terms of trigonometric
functions of the angles $\alpha, \beta, \gamma$ [cf. 62]. We shall deal later with a
further choice for the parameters characterizing
a rotation of space about the origin, and we shall also return to
Euler’s angles.
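
The non-uniqueness for $\beta = 0$ can be seen numerically. The sketch below is a hedged illustration: it assumes that the three consecutive motions correspond to the matrix product of a rotation about the third axis, a rotation about the first axis and another rotation about the third axis, which is one common convention and is not confirmed by the text. The function names are hypothetical.

```python
import math


def matmul(b, a):
    """3x3 matrix product with elements c_ik = sum over s of b_is * a_sk."""
    return [[sum(b[i][s] * a[s][k] for s in range(3)) for k in range(3)]
            for i in range(3)]


def rot_x3(angle):
    """Rotation of the space through `angle` about the third axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x1(angle):
    """Rotation of the space through `angle` about the first axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def euler(alpha, beta, gamma):
    """Assumed composition for the rotation {alpha, beta, gamma}."""
    return matmul(rot_x3(alpha), matmul(rot_x1(beta), rot_x3(gamma)))


def close(m, n, tol=1e-12):
    """True if two 3x3 matrices agree element by element up to rounding."""
    return all(abs(m[i][k] - n[i][k]) <= tol for i in range(3) for k in range(3))


alpha, gamma, delta = 0.4, 1.1, 0.25
same = close(euler(alpha, 0.0, gamma), euler(alpha + delta, 0.0, gamma - delta))
single_turn = close(euler(alpha, 0.0, gamma), rot_x3(alpha + gamma))
differs = not close(euler(alpha, 0.3, gamma), euler(alpha + delta, 0.3, gamma - delta))
print(same, single_turn, differs)
```

With $\beta = 0$ the two parameter sets give the same matrix, which is a single turn through $\alpha + \gamma$ about the third axis; with a non-zero $\beta$ the same shift of $\alpha$ and $\gamma$ changes the rotation.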


21. General linear transformations of real three-dimensional space. 
We shall now consider real linear transformations of type (2) with 
arbitrary coefficients, though it will always be assumed that the trans- 
formation determinant differs from zero: 


$$|a_{ik}| \neq 0. \tag{7}$$

The transformation is usually said to be non-singular in this case. If it
does not satisfy conditions (5), it is related to a deformation of space
[II, 113]. It should be noticed that the characteristic feature of (2)
is the matrix of coefficients, which implies a fully defined rule for
passing from any vector with components $(x_1, x_2, x_3)$ to a new vector
with components $(x'_1, x'_2, x'_3)$. We shall use a single letter to denote
the total matrix:

$$A = \begin{Vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{Vmatrix}, \tag{8}$$

the matrix being written between double strokes as before to
distinguish it from the determinant. We shall write the determinant of
matrix (8) as $D(A)$. This is some determinate number. We shall write
transformation (2) symbolically as

$$\mathbf{x}' = A\mathbf{x}, \tag{9}$$

where $\mathbf{x}'$ is the vector with components $(x'_1, x'_2, x'_3)$ and $\mathbf{x}$ has
components $(x_1, x_2, x_3)$.

The identity transformation is that in which every vector remains 
unchanged; the matrix corresponding to this is 


$$\begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix}, \tag{10}$$




which is generally known as the unit matrix and is denoted by the
symbol $I$.

Assuming $D(A) \neq 0$, we can solve equations (2) with respect to
$(x_1, x_2, x_3)$ and arrive at the expressions

$$\begin{aligned}
x_1 &= \frac{A_{11}}{D(A)}x'_1 + \frac{A_{21}}{D(A)}x'_2 + \frac{A_{31}}{D(A)}x'_3 \\
x_2 &= \frac{A_{12}}{D(A)}x'_1 + \frac{A_{22}}{D(A)}x'_2 + \frac{A_{32}}{D(A)}x'_3 \\
x_3 &= \frac{A_{13}}{D(A)}x'_1 + \frac{A_{23}}{D(A)}x'_2 + \frac{A_{33}}{D(A)}x'_3,
\end{aligned} \tag{11}$$

where the $A_{ik}$ are the cofactors of the $a_{ik}$ in $D(A)$. This linear
transformation is usually referred to as the inverse of (2), and if
$A$ denotes the matrix of (2), the matrix of (11) is written $A^{-1}$. We
now introduce a concept of importance for what follows, that of the
product of two transformations or of two matrices. Let us have two
linear transformations, from $(x_1, x_2, x_3)$ to $(x'_1, x'_2, x'_3)$:

$$\begin{aligned}
x'_1 &= a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\
x'_2 &= a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \\
x'_3 &= a_{31}x_1 + a_{32}x_2 + a_{33}x_3
\end{aligned} \quad \text{or} \quad \mathbf{x}' = A\mathbf{x}, \tag{12}$$

then from $(x'_1, x'_2, x'_3)$ to $(x''_1, x''_2, x''_3)$:

$$\begin{aligned}
x''_1 &= b_{11}x'_1 + b_{12}x'_2 + b_{13}x'_3 \\
x''_2 &= b_{21}x'_1 + b_{22}x'_2 + b_{23}x'_3 \\
x''_3 &= b_{31}x'_1 + b_{32}x'_2 + b_{33}x'_3
\end{aligned} \quad \text{or} \quad \mathbf{x}'' = B\mathbf{x}'. \tag{13}$$

These successive transitions from $(x_1, x_2, x_3)$ to $(x'_1, x'_2, x'_3)$, then from
$(x'_1, x'_2, x'_3)$ to $(x''_1, x''_2, x''_3)$, can be replaced by a direct transition from
$(x_1, x_2, x_3)$ to $(x''_1, x''_2, x''_3)$, this latter being also a linear transformation:

$$x''_k = c_{k1}x_1 + c_{k2}x_2 + c_{k3}x_3 \quad (k = 1, 2, 3). \tag{14}$$

This last transformation is described as the product of transformations
(12) and (13), an essential point to notice being the order in
which the transformations are carried out. We obtain (14) by
substituting from (12) in the right-hand sides of (13). This gives us
expressions for the elements $c_{ik}$ of the transformation product in terms of
the elements of the original transformations:

$$c_{ik} = \sum_{s=1}^{3} b_{is}a_{sk} \quad (i, k = 1, 2, 3). \tag{15}$$

We usually write (14) as follows:

$$\mathbf{x}'' = BA\mathbf{x}. \tag{16}$$



The matrix $C$ with elements $c_{ik}$ as given by (15) is called the product
of matrices $A$ and $B$ and is written thus:

$$C = BA, \tag{17}$$

where the product must be read from right to left in the sense of the
order of carrying out the transformations. If we make use of (15) and
the multiplication theorem for determinants, we can write the obvious
equality:

$$D(C) = D(B)\,D(A), \tag{18}$$

i.e. the determinant of the product of two transformations is equal to the
product of their determinants. We can easily prove the following
relationship, which has a simple geometrical meaning:

$$AA^{-1} = A^{-1}A = I. \tag{19}$$

We also notice that it follows from the actual derivation of the
inverse transformation that the inverse of $A^{-1}$ is the transformation $A$.
For, on solving system (11) with respect to the $x'_k$, we obviously again
get expressions (2). We can write this as follows:

$$(A^{-1})^{-1} = A. \tag{20}$$
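
Formulas (11) and (15) to (20) translate directly into a few lines of code. The sketch below is an illustration with two arbitrarily chosen integer matrices; the helper names `matmul`, `apply`, `det3` and `inverse` are assumptions of the example. Exact rational arithmetic is used so that the equalities can be tested without rounding.

```python
from fractions import Fraction


def matmul(b, a):
    """Product C = BA with elements c_ik = sum over s of b_is * a_sk, as in (15)."""
    return [[sum(b[i][s] * a[s][k] for s in range(3)) for k in range(3)]
            for i in range(3)]


def apply(a, x):
    """Components of the vector Ax."""
    return [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def inverse(a):
    """Inverse by cofactors as in (11): element (i, k) is A_ki / D(A)."""
    d = Fraction(det3(a))

    def cofactor(i, k):
        rows = [p for p in range(3) if p != i]
        cols = [q for q in range(3) if q != k]
        minor = (a[rows[0]][cols[0]] * a[rows[1]][cols[1]]
                 - a[rows[0]][cols[1]] * a[rows[1]][cols[0]])
        return (-1) ** (i + k) * minor

    return [[cofactor(k, i) / d for k in range(3)] for i in range(3)]


# Two sample non-singular matrices (chosen only for the example).
A = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]
B = [[1, 2, 0], [0, 1, 0], [3, 0, 1]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
x = [3, -1, 2]

C = matmul(B, A)
print(apply(C, x) == apply(B, apply(A, x)))      # (16): x'' = BAx
print(det3(C) == det3(B) * det3(A))              # (18)
print(matmul(A, inverse(A)) == I, matmul(inverse(A), A) == I)   # (19)
print(inverse(inverse(A)) == A)                  # (20)
```

The function `inverse` divides by $D(A)$, so it fails with a division error for a singular matrix, in line with the assumption $D(A) \neq 0$.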


The concept of transformation product can be extended to the case
of any number of factors; e.g. the result of successive transformations
with matrices $A$, $B$, and $C$ is a transformation with matrix $D$:

$$D = CBA. \tag{21}$$

If matrices $A$, $B$, $C$ have the elements

$$a_{ik}, \; b_{ik}, \text{ and } c_{ik},$$

the matrix $D$ will have elements given by the expressions:

$$d_{ik} = \sum_{p,q=1}^{3} c_{iq}b_{qp}a_{pk}. \tag{22}$$

We have, in fact, for the elements of the matrix $E = BA$:

$$e_{ik} = \sum_{p=1}^{3} b_{ip}a_{pk}$$

and finally, for the element of $CE$, by (15):

$$d_{ik} = \sum_{q=1}^{3} c_{iq}e_{qk},$$



whence (22) follows. It must be mentioned that the elements of a
matrix $A$ will often be written in future as

$$\{A\}_{ik}.$$

Matrix products are not generally subject to the commutative law,
i.e. they change when factors are interchanged, so that in general e.g.
$BA \neq AB$. They obey the associative law, however, that is to
say, their factors can be grouped:

$$C(BA) = (CB)A. \tag{23}$$

On the left-hand side, we must multiply $A$ by $B$, then multiply the
result by $C$. On the right-hand side, we first multiply $B$ by $C$, then
multiply $A$ by the result. It is easily seen that in both cases the
elements of the matrix finally obtained are given by (22). This has
already been proved for the left-hand side; we have for the right-hand
side, on carrying out the successive multiplications:

$$\{CB\}_{ip} = \sum_{q=1}^{3} \{C\}_{iq}\{B\}_{qp}$$

and

$$\{(CB)A\}_{ik} = \sum_{p=1}^{3} \{CB\}_{ip}\{A\}_{pk} = \sum_{p,q=1}^{3} \{C\}_{iq}\{B\}_{qp}\{A\}_{pk},$$

which is evidently (22) in our new notation.
A further important type of linear transformation must be mentioned,
in which we have:

$$x'_1 = k_1x_1; \quad x'_2 = k_2x_2; \quad x'_3 = k_3x_3, \tag{24}$$

and which amounts to extension (or contraction) along the coordinate
axes, the extension being characterized by the numerical coefficients
$k_1, k_2, k_3$. The matrix of this transformation is obviously

$$\begin{Vmatrix} k_1 & 0 & 0 \\ 0 & k_2 & 0 \\ 0 & 0 & k_3 \end{Vmatrix},$$

i.e. all the elements not on the principal diagonal are zero. We refer
to this type as a diagonal matrix, and denote it by

$$[k_1, k_2, k_3].$$

In particular, if $k_1 = k_2 = k_3 = k$, the transformation reduces to
multiplication of all the components of a vector by the same number $k$ and is




evidently the transformation of similitude with centre at the origin.
Every vector changes its length, which is multiplied by $k$, without
changing its direction (we are assuming $k > 0$). The following simple
notation is used in this case:

$$\mathbf{x}' = k\mathbf{x},$$

i.e. we look on the number $k$ as a particular case of a matrix, and take
it as in fact the diagonal matrix with the same element $k$ on the
principal diagonal:

$$\begin{Vmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & k \end{Vmatrix}. \tag{25}$$

It may readily be seen by using (15) that multiplication of such
matrices reduces to the ordinary multiplication of numbers:

$$[k, k, k] \cdot [l, l, l] = [kl, kl, kl].$$

It may easily be shown that, in general, the simple multiplication
rule:

$$[k_1, k_2, k_3] \cdot [l_1, l_2, l_3] = [k_1l_1, k_2l_2, k_3l_3], \tag{26}$$


applies for diagonal matrices, i.e. two extensions along the coordinate 
axes are equivalent to a single extension with coefficients equal to 
the products of the corresponding coefficients of the component ex- 
tensions. An immediate consequence of (26) is that the product of two 
diagonal matrices is unchanged on interchanging the factors. By using 
(15) and representation (25) of a number as a diagonal matrix, it can
easily be seen that the product $kA$ is obtained simply by multiplying all
the elements of $A$ by the number $k$. This product is independent of the
order of the factors, i.e.

$$\{kA\}_{ik} = \{Ak\}_{ik} = k\{A\}_{ik}. \tag{27}$$


We have regarded the basic linear transformation (2) as a
deformation of space in which a vector with components $(x_1, x_2, x_3)$ becomes
a new vector with components $(x'_1, x'_2, x'_3)$. Of course (2) can also be
interpreted, as already mentioned, as a point transformation in which
a point with coordinates $(x_1, x_2, x_3)$ becomes a point with coordinates
$(x'_1, x'_2, x'_3)$.

We could have used any system of axes, in other words, any funda- 
mental vector set, for defining vector components, i.e. we could have 
taken any three non-coplanar unit vectors i, j, k as the fundamental 




set, in which case, as we know from [II, 102], any vector $\mathbf{x}$ can be
expressed uniquely in the form

$$\mathbf{x} = x_1\mathbf{i} + x_2\mathbf{j} + x_3\mathbf{k}. \tag{28}$$

The numbers $x_1, x_2, x_3$ are called the components of $\mathbf{x}$ in the
coordinate system defined by the fundamental set $\mathbf{i}, \mathbf{j}, \mathbf{k}$. Our next task
is to see the effect of a different choice of fundamental set on the form
of the linear transformation.

More precisely, if a linear transformation in the coordinate system
defined by $\mathbf{i}, \mathbf{j}, \mathbf{k}$ is given by (12), what is the form of this same
transformation of space in another coordinate system, defined by say
$\mathbf{i}_1, \mathbf{j}_1, \mathbf{k}_1$? Let the new fundamental set be given in terms of the old by the
expressions:

$$\begin{aligned}
\mathbf{i}_1 &= t_{11}\mathbf{i} + t_{12}\mathbf{j} + t_{13}\mathbf{k} \\
\mathbf{j}_1 &= t_{21}\mathbf{i} + t_{22}\mathbf{j} + t_{23}\mathbf{k} \\
\mathbf{k}_1 &= t_{31}\mathbf{i} + t_{32}\mathbf{j} + t_{33}\mathbf{k}.
\end{aligned} \tag{29}$$

It will be noticed that the determinant made up of coefficients $t_{ik}$
cannot vanish; if it did, $\mathbf{i}_1, \mathbf{j}_1, \mathbf{k}_1$ would be linearly dependent, i.e.
coplanar. In the new coordinate system, the vector given by (28) will
have new components:

$$y_1\mathbf{i}_1 + y_2\mathbf{j}_1 + y_3\mathbf{k}_1.$$

We first of all establish expressions for the new components in
terms of the old. We obviously have, on substituting expressions (29)
for the new fundamental vectors:

$$\sum_{s=1}^{3} y_s(t_{s1}\mathbf{i} + t_{s2}\mathbf{j} + t_{s3}\mathbf{k}) = x_1\mathbf{i} + x_2\mathbf{j} + x_3\mathbf{k}.$$

We get expressions for the old components in terms of the new by
equating coefficients of $\mathbf{i}, \mathbf{j}, \mathbf{k}$:

$$\begin{aligned}
x_1 &= t_{11}y_1 + t_{21}y_2 + t_{31}y_3 \\
x_2 &= t_{12}y_1 + t_{22}y_2 + t_{32}y_3 \\
x_3 &= t_{13}y_1 + t_{23}y_2 + t_{33}y_3.
\end{aligned} \tag{30}$$

The first subscript remains unchanged along a row in the matrix of
transformation (29), whereas the second subscript remains unchanged
in the rows of the matrix of (30). The matrices thus differ to the extent
of rows being replaced by columns. If $T$ denotes the matrix of (29),
the array of (30) is known as the transpose of $T$ and is written $T^{(*)}$.


Expressions (30) may be written in the abbreviated form: 
$$(x_1, x_2, x_3) = T^{(*)}(y_1, y_2, y_3), \tag{31}$$

where $(x_1, x_2, x_3)$ are the three components of a vector with respect
to the first fundamental set, and $(y_1, y_2, y_3)$ the components with
respect to the new fundamental set. Conversely, the new components
may be expressed in terms of the old by

$$(y_1, y_2, y_3) = T^{(*)-1}(x_1, x_2, x_3),$$

where $T^{(*)-1}$ is the inverse linear transformation to $T^{(*)}$. This is
generally known as the contragredient of $T$. For brevity, we use a special
letter to denote the array corresponding to it:

$$U = T^{(*)-1}. \tag{32}$$


We can thus say that a change in the fundamental set in accordance
with (29) implies that the components of every vector undergo a
linear transformation with the matrix $U$ defined by (32). Hence the
two vectors $\mathbf{x}(x_1, x_2, x_3)$ and $\mathbf{x}'(x'_1, x'_2, x'_3)$ that appear in transformation
(9) will have different components after transformation of the
fundamental set, these being given in terms of the original components by

$$(y_1, y_2, y_3) = U(x_1, x_2, x_3); \quad (y'_1, y'_2, y'_3) = U(x'_1, x'_2, x'_3). \tag{33}$$

Our problem is to establish the linear relationship between
components $(y_1, y_2, y_3)$ and $(y'_1, y'_2, y'_3)$. We can pass from the former to the
latter by the following method: we first pass from vector $(y_1, y_2, y_3)$
to $(x_1, x_2, x_3)$ with the aid of the matrix $U^{-1}$, by (33). Then we pass
from $(x_1, x_2, x_3)$ to $(x'_1, x'_2, x'_3)$ with the aid of matrix $A$ of (9), and
finally, from $(x'_1, x'_2, x'_3)$ to $(y'_1, y'_2, y'_3)$ with the aid of matrix $U$.
We thus end up with the linear transformation:

$$\mathbf{y}' = UAU^{-1}\mathbf{y}. \tag{34}$$

This transformation is said to be similar to transformation (9), and
its matrix $UAU^{-1}$ is said to be similar to matrix $A$.

Our result may finally be stated as: if the linear transformations of
vector components due to a change in the fundamental set are given by
expressions (33), any linear transformation of space which has the form,
in the original fundamental set:

$$\mathbf{x}' = A \mathbf{x},$$

becomes in the new coordinate system:

$$\mathbf{y}' = U A U^{-1} \mathbf{y}.$$
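
The rule just stated can be checked numerically. The following is a minimal sketch in Python with exact fractions; the matrices `T` and `A`, the vector `x` and all helper names are illustrative assumptions and do not come from the text.

```python
from fractions import Fraction

def minor(P, i, k):
    """P with row i and column k deleted."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(P) if r != i]

def det(P):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if not P:
        return 1
    return sum((-1) ** k * P[0][k] * det(minor(P, 0, k)) for k in range(len(P)))

def inverse(P):
    """Inverse from cofactors: element (i, k) is cofactor(k, i) / determinant."""
    d, n = Fraction(det(P)), len(P)
    return [[(-1) ** (i + k) * det(minor(P, k, i)) / d for k in range(n)]
            for i in range(n)]

def transpose(P):
    return [list(col) for col in zip(*P)]

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][s] * Q[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]

def matvec(P, x):
    return [sum(p * c for p, c in zip(row, x)) for row in P]

# Hypothetical data: T plays the part of the matrix of (29), A that of (9).
T = [[1, 1, 0], [0, 1, 1], [1, 0, 2]]
A = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
x = [1, 2, 3]

U = inverse(transpose(T))                    # (32): contragredient of T
x_prime = matvec(A, x)                       # (9):  x' = A x
y = matvec(U, x)                             # (33): new components of x
y_prime = matvec(U, x_prime)                 # (33): new components of x'
similar = matmul(matmul(U, A), inverse(U))   # U A U^{-1}
```

Applying `similar` to `y` reproduces `y_prime`, which is the content of (34) for this particular choice of data.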




22. Covariant and contravariant affine vectors. Suppose that linear
transformation (9) simply expresses the passage from one system of Cartesian
axes to another, i.e. its coefficients are the direction-cosines given in table
(3). In this case, as we saw in [20], the transpose $A^{(*)}$ is the same as the inverse
$A^{-1}$, and the contragredient $A^{(*)-1}$ is therefore the same as the original matrix
$A$, i.e.

$$A^{(*)} = A^{-1}; \quad A^{(*)-1} = A. \qquad (35)$$


If we take a vector of constant length and direction, we can naturally say 
that its components are transformed in accordance with the same expressions 
(9) as the coordinates, i.e. 


$$x'_1 = a_{11} x_1 + a_{12} x_2 + a_{13} x_3$$
$$x'_2 = a_{21} x_1 + a_{22} x_2 + a_{23} x_3 \qquad (36)$$
$$x'_3 = a_{31} x_1 + a_{32} x_2 + a_{33} x_3.$$


We can, therefore, say that a vector is completely characterized by three
numbers in any fixed Cartesian system, and on passage from one Cartesian
system to another the three numbers (vector components) are transformed
in accordance with the same expressions (36) as the coordinates. Suppose that
we now take into account not only passage from one Cartesian system to another,
but all possible linear transformations of coordinates with non-zero
determinant, which correspond, as we saw above, to an arbitrary choice
of three non-coplanar vectors as the fundamental set. As above, along with
matrix $A$ of transformation (36), we shall consider the contragredient
$V = A^{(*)-1}$. These are distinct in the general case, so that we have two
possibilities for defining a vector in any linear coordinate transformation. In the first
place, we can define a vector as a set of three numbers which is transformed
on passage from one coordinate system to another by the same formulae as the
coordinates themselves, i.e. by

$$(x'_1, x'_2, x'_3) = A (x_1, x_2, x_3). \qquad (37)$$

Such a vector is described as a contravariant affine vector, the general linear
transformation (36) being sometimes referred to as an affine transformation.
Alternatively, we can define a vector such that its components undergo the
corresponding contragredient transformation for any linear transformation
(36), i.e.

$$(x'_1, x'_2, x'_3) = V (x_1, x_2, x_3). \qquad (38)$$

Such a vector is known as a covariant affine vector.

In both cases, given the components of a vector in any one coordinate
system, we automatically obtain the components in any other coordinate
system which is derived from the original system by means of an affine
transformation. Examples of both types of vector are as follows. The radius vector
joining two given points in space is clearly contravariant, since its components
in the above sense of the word (the differences between the coordinates of its
end-points) are transformed in accordance with the same linear formulae as
the coordinates themselves. Another example of a contravariant vector may




be quoted. We take the coordinates $(x_1, x_2, x_3)$ of a point as functions of a
parameter $t$ and define a velocity vector with components

$$\left( \frac{dx_1}{dt}, \frac{dx_2}{dt}, \frac{dx_3}{dt} \right).$$

On differentiating the basic expressions (36) with respect to $t$, we see at
once that the velocity vector is contravariant.

We now give an example of a covariant vector. Let $f(x_1, x_2, x_3)$ be a function
of a point in space; the gradient of the function in any given coordinate system
is defined as the vector with components

$$\left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial x_3} \right).$$

We have by (36) and the rule for differentiating functions of a function:

$$\frac{\partial f}{\partial x_k} = \frac{\partial f}{\partial x'_1} a_{1k} + \frac{\partial f}{\partial x'_2} a_{2k} + \frac{\partial f}{\partial x'_3} a_{3k} \qquad (k = 1, 2, 3),$$

i.e. the components of the gradient along the $(x_1, x_2, x_3)$ axes are given in terms
of the components along the $(x'_1, x'_2, x'_3)$ axes by a linear transformation with
matrix $A^{(*)}$, whence it follows that the components along the $(x'_1, x'_2, x'_3)$ axes
are given in terms of the components along the $(x_1, x_2, x_3)$ axes by a linear
transformation with matrix $A^{(*)-1} = V$, i.e. the gradient of a function is in
fact a covariant vector.
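
For a linear function the covariant character of the gradient can be verified directly. The sketch below is only an illustration under stated assumptions: the matrix `A`, the coefficients `c` of the function $f = c_1 x_1 + c_2 x_2 + c_3 x_3$, the point `x` and the helper names are all hypothetical.

```python
from fractions import Fraction

def minor(P, i, k):
    """P with row i and column k deleted."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(P) if r != i]

def det(P):
    """Determinant by expansion along the first row."""
    if not P:
        return 1
    return sum((-1) ** k * P[0][k] * det(minor(P, 0, k)) for k in range(len(P)))

def inverse(P):
    """Inverse from cofactors: element (i, k) is cofactor(k, i) / determinant."""
    d, n = Fraction(det(P)), len(P)
    return [[(-1) ** (i + k) * det(minor(P, k, i)) / d for k in range(n)]
            for i in range(n)]

def transpose(P):
    return [list(col) for col in zip(*P)]

def matvec(P, x):
    return [sum(p * c for p, c in zip(row, x)) for row in P]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]   # hypothetical matrix of (36), determinant 3
c = [2, -1, 3]                          # gradient of f in the old coordinates
x = [1, 4, -2]                          # an arbitrary point, old coordinates

V = inverse(transpose(A))               # contragredient matrix
grad_new = matvec(V, c)                 # gradient components transformed as in (38)
x_new = matvec(A, x)                    # coordinates transformed as in (37)

f_old = dot(c, x)                       # f evaluated in the old system
f_new = dot(grad_new, x_new)            # f evaluated in the new system
```

The two values of $f$ agree, which is possible only if the gradient components are transformed by $V$ while the coordinates are transformed by $A$.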

Expressions (37) and (38) may readily be written in terms of the partial
derivatives of the new coordinates with respect to the old, and vice versa. We
first introduce a notation which is somewhat different from the above and is more
usual in vector theory: a superscript is used for the components of
contravariant vectors and a subscript for the covariant components, the corresponding
coordinates being themselves denoted by a superscript.

The coefficients of transformation (36) can be written as follows in terms of
the partial derivatives:

$$a_{ik} = \frac{\partial x'^{(i)}}{\partial x^{(k)}}. \qquad (39)$$

The elements of the contragredient matrix $V = A^{(*)-1}$ are:

$$\{V\}_{ik} = \frac{A_{ik}}{D(A)},$$

where $A_{ik}$ is the cofactor of $a_{ik}$ and $D(A)$ the determinant of $A$,
and $(A^{-1})^{(*)}$ has the same elements, i.e.

$$A^{(*)-1} = (A^{-1})^{(*)},$$

i.e. we can first pass to the inverse matrix and then interchange rows and columns.


On passage to the inverse matrix, the coefficient $a_{ik}$ becomes $\partial x^{(i)} / \partial x'^{(k)}$, and
after transposition, we have for the elements of matrix $V$:

$$v_{ik} = \frac{\partial x^{(k)}}{\partial x'^{(i)}}. \qquad (40)$$

Let the components of a contravariant vector be $u^{(i)}$ in coordinates $x^{(i)}$
and $u'^{(i)}$ in coordinates $x'^{(i)}$. We have by definition:

$$u'^{(i)} = \sum_{s=1}^{3} \frac{\partial x'^{(i)}}{\partial x^{(s)}} u^{(s)} \qquad (i = 1, 2, 3). \qquad (41)$$

Similarly, we get by definition for a covariant vector:

$$u'_i = \sum_{s=1}^{3} \frac{\partial x^{(s)}}{\partial x'^{(i)}} u_s \qquad (i = 1, 2, 3). \qquad (42)$$


It may be mentioned that these formulae can be used for defining the
components of a vector not only under linear transformation of the coordinates but
also under the most general type of transformation, in which the individual coordinates
are expressed in terms of the others by functions that are in general non-linear.
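
A non-linear case can be tried out numerically. The sketch below assumes, purely for illustration, a plane with Cartesian coordinates $(x, y)$ and polar coordinates $(r, \theta)$ as the new coordinates (two dimensions rather than three, to keep it short); the function names and numerical values are hypothetical. It transforms one set of components by (41) and another by (42).

```python
import math

def polar_jacobian(x, y):
    """Partial derivatives of the new coordinates (r, theta) with respect to (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r], [-y / r2, x / r2]]

def inverse_jacobian(r, theta):
    """Partial derivatives of the old coordinates (x, y) with respect to (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

x, y = 3.0, 4.0
r, theta = math.hypot(x, y), math.atan2(y, x)
J = polar_jacobian(x, y)          # J[i][s] = d x'^(i) / d x^(s)
K = inverse_jacobian(r, theta)    # K[s][i] = d x^(s) / d x'^(i)

u = [1.0, -2.0]   # contravariant components (say a velocity) in (x, y)
v = [0.5, 3.0]    # covariant components (say a gradient) in (x, y)

# (41): u'^(i) = sum over s of (d x'^(i) / d x^(s)) u^(s)
u_new = [sum(J[i][s] * u[s] for s in range(2)) for i in range(2)]
# (42): u'_i = sum over s of (d x^(s) / d x'^(i)) u_s
v_new = [sum(K[s][i] * v[s] for s in range(2)) for i in range(2)]

old_sum = sum(a * b for a, b in zip(u, v))
new_sum = sum(a * b for a, b in zip(u_new, v_new))
```

The sum of products of corresponding components comes out the same in both coordinate systems, up to rounding.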

We shall indicate another possible definition of covariant vector, when the
contravariant vector is defined simply as the vector whose components are
transformed in accordance with the same formulae as the coordinates. Let
$u^{(i)}$ be a given contravariant, and $v_i$ a covariant vector.

We form the sum:

$$u^{(1)} v_1 + u^{(2)} v_2 + u^{(3)} v_3. \qquad (43)$$

This may easily be seen to remain invariable, or in other words, to be a
scalar, if $u^{(i)}$ and $v_i$ vary in accordance with the corresponding expressions
(41) and (42).

For the rule for differentiating functions of a function gives us at once:

$$\sum_{s=1}^{3} u'^{(s)} v'_s = \sum_{s=1}^{3} \left[ \sum_{i=1}^{3} \frac{\partial x'^{(s)}}{\partial x^{(i)}} u^{(i)} \right] \left[ \sum_{k=1}^{3} \frac{\partial x^{(k)}}{\partial x'^{(s)}} v_k \right] = u^{(1)} v_1 + u^{(2)} v_2 + u^{(3)} v_3,$$

since $\sum_{s} \frac{\partial x^{(k)}}{\partial x'^{(s)}} \frac{\partial x'^{(s)}}{\partial x^{(i)}} = \frac{\partial x^{(k)}}{\partial x^{(i)}}$, which is unity for $i = k$ and zero otherwise.

Hence, having defined a contravariant vector by the method given above,
we can find the transformation rule for the covariant components from the
requirement that sum (43) remains invariable. An exact repetition of the
working of the previous section leads us to the conclusion that, given the invariability
of (43), the components $v_i$ must undergo the linear transformation contragredient
to that suffered by the components $u^{(i)}$. We suggest that the reader show that,
for any (linear or non-linear) coordinate transformation, the velocity vector is
a contravariant vector whilst the gradient of a function is a covariant vector.

A distinction is worth noticing between contravariant and covariant vectors
which have been defined in a purely formal manner above, by formulae for
passing from one system to the other. Let x be a vector of given length and
direction. Given the fundamental set, we form components in accordance
with (28) and now refer to them as the contravariant components, (28) being
written in the form

$$\mathbf{x} = x^{(1)} \mathbf{i} + x^{(2)} \mathbf{j} + x^{(3)} \mathbf{k}. \qquad (44)$$


The covariant component of x along i is defined as the rectangular projection
of x on i, multiplied by the length of i, and similarly for the other fundamental
vectors. We thus have, for each fundamental set, three covariant components
$(x_1, x_2, x_3)$. It can be shown that these are transformed like the components
of a covariant vector on passage from one fundamental set to another. For it
can be shown (we shall not dwell on the proof) that the expression

$$x^{(1)} x_1 + x^{(2)} x_2 + x^{(3)} x_3$$

here gives the square of the length of vector x and is therefore unchanged on
transformation of the fundamental set.


23. Tensors. We now turn to a generalization of vectors, only linear coordinate
transformations being considered initially. Let the array of nine numbers

$$b_{ik} \qquad (i, k = 1, 2, 3)$$

be given in a certain coordinate system.
We form the expression

$$\sum_{i,k=1}^{3} b_{ik} u^{(i)} v^{(k)}, \qquad (45)$$

where $u^{(i)}$ and $v^{(k)}$ are the components of two contravariant vectors. On passing
to new coordinates, we can express the $u^{(i)}$ and $v^{(k)}$ in (45) in terms of the new
components $u'^{(i)}$ and $v'^{(k)}$ and hence transform (45) as follows:

$$\sum_{i,k=1}^{3} b_{ik} u^{(i)} v^{(k)} = \sum_{i,k=1}^{3} b'_{ik} u'^{(i)} v'^{(k)}. \qquad (46)$$

Now we still have an array of nine numbers, with elements $b'_{ik}$, in the new
coordinate system. Such an array, defined in any coordinate system by requiring
invariance of expression (45), is described as a covariant tensor of the second
rank. Similarly, on taking two covariant vectors with components $u_i$ and $v_k$
and forming the expression

$$\sum_{i,k=1}^{3} b^{(ik)} u_i v_k, \qquad (47)$$

where the array of nine numbers $b^{(ik)}$ is specified in some given coordinate
system, we obtain a similar array in any other coordinate system on requiring
invariance of expression (47). We now have a contravariant tensor of the second
rank. Finally, if we take a contravariant vector with components $u^{(i)}$ and a
covariant vector with components $v_k$ and form the expression

$$\sum_{i,k=1}^{3} b_i^{(k)} u^{(i)} v_k, \qquad (48)$$

we arrive in precisely the same way at a mixed tensor of the second rank.

We now show how, given the coefficients of the linear coordinate
transformation (36), expressions can be derived for the components of a tensor in the new
coordinates in terms of its components in the old. We start by considering a
covariant tensor of the second rank. The components $u^{(i)}$ and $v^{(k)}$ of the
contravariant vectors in the old coordinates are expressed in terms of the components
$u'^{(i)}$ and $v'^{(k)}$ in the new coordinate system with the aid of a linear
transformation of matrix $A^{-1}$. This gives us, on writing the elements of the matrix
as $\{A^{-1}\}_{ik}$:

$$u^{(i)} = \sum_{k=1}^{3} \{A^{-1}\}_{ik} u'^{(k)}; \quad v^{(i)} = \sum_{k=1}^{3} \{A^{-1}\}_{ik} v'^{(k)}.$$

We substitute in (45), find the coefficients of the products $u'^{(i)} v'^{(k)}$, and thus
obtain for the components $b'_{ik}$ of the tensor in the new coordinate system:

$$b'_{ik} = \sum_{p,q=1}^{3} b_{pq} \{A^{-1}\}_{pi} \{A^{-1}\}_{qk}. \qquad (49)$$


Similarly, for a contravariant tensor of the second rank, we have to express
the components of the covariant vectors $u_i$ and $v_k$ in terms of the new components.
By definition of covariant vector, $u'_i$ is given in terms of $u_i$ by means of
the array $A^{(*)-1}$, so that $u_i$ is given in terms of $u'_i$ via the array $A^{(*)}$, the
transpose of $A$, and similarly for $v_i$:

$$u_i = \sum_{k=1}^{3} \{A\}_{ki} u'_k; \quad v_i = \sum_{k=1}^{3} \{A\}_{ki} v'_k.$$

Substitution in (47) gives us the transformation for the components of a
contravariant tensor of the second rank:

$$b'^{(ik)} = \sum_{p,q=1}^{3} b^{(pq)} \{A\}_{ip} \{A\}_{kq}. \qquad (50)$$

Similarly, we have the following transformation for the components of a
mixed tensor of the second rank:

$$b'^{(k)}_i = \sum_{p,q=1}^{3} b_p^{(q)} \{A^{-1}\}_{pi} \{A\}_{kq}. \qquad (51)$$


If we express the coefficients of the linear transformation in terms of the
partial derivatives

$$\frac{\partial x'^{(i)}}{\partial x^{(k)}} \quad \text{and} \quad \frac{\partial x^{(i)}}{\partial x'^{(k)}}$$

and substitute these expressions in the above formulae, we get formulae for
transforming tensors of the second rank in the case of any coordinate
transformation. Analogous definitions to the above are possible for tensors of rank
higher than the second, but we shall not dwell on this.

We have constantly been concerned above with matrices expressing linear
transformations of three-dimensional space with given coordinate axes. Let
such a matrix be $B$ and let an affine coordinate transformation have been carried
out in accordance with

$$(y_1, y_2, y_3) = A (x_1, x_2, x_3),$$

where $A$ is a matrix with a non-zero determinant. As we have shown above,
our spatial transformation has the matrix in the new coordinates:

$$A B A^{-1}.$$




It is easily seen that the transformation worked out above for a mixed
tensor of the second rank has the same matrix. For, on applying the rule for
matrix multiplication, we have

$$\{B A^{-1}\}_{qi} = \sum_{p=1}^{3} \{B\}_{qp} \{A^{-1}\}_{pi},$$

followed by:

$$\{A (B A^{-1})\}_{ki} = \sum_{q=1}^{3} \{A\}_{kq} \{B A^{-1}\}_{qi} = \sum_{p,q=1}^{3} \{B\}_{qp} \{A^{-1}\}_{pi} \{A\}_{kq}.$$

If we write $b_p^{(q)}$ instead of $\{B\}_{qp}$, we have an expression of the same form as
(51). The matrix of a linear transformation of space is thus a mixed tensor of
the second rank.
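
The transformation rules for covariant and mixed tensors can be tried out on small integer arrays. The following sketch makes several assumptions for the sake of illustration: the arrays `A`, `b`, `B` and the vectors `u`, `v` are hypothetical, the helper names are ours, and a mixed tensor is stored so that `m[p][q]` stands for the component with subscript $p$ and superscript $q$.

```python
from fractions import Fraction

def minor(P, i, k):
    return [row[:k] + row[k + 1:] for r, row in enumerate(P) if r != i]

def det(P):
    if not P:
        return 1
    return sum((-1) ** k * P[0][k] * det(minor(P, 0, k)) for k in range(len(P)))

def inverse(P):
    """Inverse from cofactors: element (i, k) is cofactor(k, i) / determinant."""
    d, n = Fraction(det(P)), len(P)
    return [[(-1) ** (i + k) * det(minor(P, k, i)) / d for k in range(n)]
            for i in range(n)]

def transpose(P):
    return [list(col) for col in zip(*P)]

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][s] * Q[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]

def matvec(P, x):
    return [sum(p * c for p, c in zip(row, x)) for row in P]

def transform_covariant(b, A):
    """New component (i, k): sum over p, q of b[p][q] * Ainv[p][i] * Ainv[q][k]."""
    Ai, r = inverse(A), range(len(A))
    return [[sum(b[p][q] * Ai[p][i] * Ai[q][k] for p in r for q in r) for k in r]
            for i in r]

def transform_mixed(m, A):
    """New component (i, k): sum over p, q of m[p][q] * Ainv[p][i] * A[k][q]."""
    Ai, r = inverse(A), range(len(A))
    return [[sum(m[p][q] * Ai[p][i] * A[k][q] for p in r for q in r) for k in r]
            for i in r]

def bilinear(b, u, v):
    """Sum over i, k of b[i][k] * u[i] * v[k]."""
    r = range(len(b))
    return sum(b[i][k] * u[i] * v[k] for i in r for k in r)

A = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]    # hypothetical coordinate transformation
b = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]   # hypothetical covariant tensor
u, v = [1, 0, 2], [3, -1, 1]             # contravariant vectors, old components

b_new = transform_covariant(b, A)
old_value = bilinear(b, u, v)
new_value = bilinear(b_new, matvec(A, u), matvec(A, v))

B = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]        # matrix of a spatial transformation
m_new = transform_mixed(transpose(B), A)     # stored with m[p][q] = B[q][p]
similar = matmul(matmul(A, B), inverse(A))   # A B A^{-1}
```

With these data the bilinear expression keeps its value, and the transformed mixed tensor coincides (after the same transposition of storage) with the matrix $A B A^{-1}$.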

Some tensors of a particular kind must be mentioned. Let a covariant tensor
in a given coordinate system have the property that

$$b_{ik} = b_{ki} \qquad (i, k = 1, 2, 3). \qquad (52)$$

It may easily be seen to have the same property in any other coordinate
system. For, by (49):

$$b'_{ki} = \sum_{p,q=1}^{3} b_{pq} \{A^{-1}\}_{pk} \{A^{-1}\}_{qi},$$

or by (52):

$$b'_{ki} = \sum_{p,q=1}^{3} b_{qp} \{A^{-1}\}_{pk} \{A^{-1}\}_{qi},$$

or, on changing the notation for the variables of summation:

$$b'_{ki} = \sum_{p,q=1}^{3} b_{pq} \{A^{-1}\}_{pi} \{A^{-1}\}_{qk},$$

whence it is clear that $b'_{ki}$ is in fact the same as $b'_{ik}$. We describe this type as
a symmetrical covariant tensor. A symmetrical contravariant tensor can be defined
in exactly the same way. Similarly, if $b_{ik} = -b_{ki}$ or $b^{(ik)} = -b^{(ki)}$ in one
coordinate system, the same will be true in any coordinate system, and the
corresponding tensor is said to be skew-symmetric. The same situation does not
hold for a mixed tensor, so that, e.g., $b_i^{(k)} = b_k^{(i)}$ is not an invariant relationship
on transformation of the coordinates. We shall consider next some particular
cases of tensors.
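
A small two-dimensional illustration may make the last remark concrete. In this sketch the matrix `A`, the symmetric array `sym` and the helper names are hypothetical; the same array is transformed once by the covariant rule and once by the mixed rule (with `m[p][q]` standing for the component with subscript $p$ and superscript $q$).

```python
from fractions import Fraction

def inverse2(P):
    """Inverse of a non-singular second order matrix."""
    (a, b), (c, d) = P
    D = Fraction(a * d - b * c)
    return [[d / D, -b / D], [-c / D, a / D]]

def transform_covariant(b, A):
    """New component (i, k): sum over p, q of b[p][q] * Ainv[p][i] * Ainv[q][k]."""
    Ai, r = inverse2(A), range(2)
    return [[sum(b[p][q] * Ai[p][i] * Ai[q][k] for p in r for q in r) for k in r]
            for i in r]

def transform_mixed(m, A):
    """New component (i, k): sum over p, q of m[p][q] * Ainv[p][i] * A[k][q]."""
    Ai, r = inverse2(A), range(2)
    return [[sum(m[p][q] * Ai[p][i] * A[k][q] for p in r for q in r) for k in r]
            for i in r]

def is_symmetric(P):
    return all(P[i][k] == P[k][i] for i in range(len(P)) for k in range(len(P)))

A = [[1, 1], [0, 1]]      # a non-orthogonal coordinate transformation
sym = [[0, 1], [1, 0]]    # symmetric array in the old coordinates

cov_new = transform_covariant(sym, A)   # treated as a covariant tensor
mixed_new = transform_mixed(sym, A)     # treated as a mixed tensor
```

Here `cov_new` is again symmetric, whereas `mixed_new` is not, in agreement with the statement that symmetry of a mixed tensor is not an invariant property.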


24. Examples of affine orthogonal tensors. We confine ourselves in future
examples to the linear coordinate transformations discussed in [20], which
correspond to the passage from one Cartesian system to another, and which are
generally known as orthogonal transformations of three-dimensional space.
The contragredient transformation $A^{(*)-1}$ coincides with $A$ for these, as we
have already seen, and the distinction between contravariant and covariant
vectors disappears. Similarly, it is clear that we now have just the one concept
of second rank tensor. If we write $\{A\}_{ik}$ for the matrix coefficients of an
orthogonal coordinate transformation as above, we have the following formula for
transforming a tensor of the second rank:

$$b'_{ik} = \sum_{p,q=1}^{3} b_{pq} \{A\}_{ip} \{A\}_{kq}, \qquad (53)$$

which follows at once from the expressions of the previous section. The elements
of a column of $\| b_{ik} \|$ will be looked on as vector components. Hence we have
the three vectors:

$$\mathbf{b}^{(1)} (b_{11}, b_{21}, b_{31}); \quad \mathbf{b}^{(2)} (b_{12}, b_{22}, b_{32}); \quad \mathbf{b}^{(3)} (b_{13}, b_{23}, b_{33}).$$

We shall say that the first of these corresponds to the $x_1$, the second to the
$x_2$, and the third to the $x_3$ axis. We now relate a vector $\mathbf{b}^{(n)}$ to any direction
$(n)$ in accordance with the formula:

$$\mathbf{b}^{(n)} = \cos (n, x_1)\, \mathbf{b}^{(1)} + \cos (n, x_2)\, \mathbf{b}^{(2)} + \cos (n, x_3)\, \mathbf{b}^{(3)}. \qquad (54)$$


We next replace the original Cartesian system $(x_1, x_2, x_3)$ by $(x'_1, x'_2, x'_3)$
and use (54) to form the vectors corresponding to the new directions of the
axes:

$$\mathbf{b}'^{(k)} = \cos (x'_k, x_1)\, \mathbf{b}^{(1)} + \cos (x'_k, x_2)\, \mathbf{b}^{(2)} + \cos (x'_k, x_3)\, \mathbf{b}^{(3)}. \qquad (55)$$

On taking the projections of these vectors on the new $(x'_1, x'_2, x'_3)$ axes, we
get an array of nine numbers $\| b'_{ik} \|$ analogous to $\| b_{ik} \|$. We show that the
elements of the new array are given in terms of those of the original array
precisely by the formulae for transforming tensors of the second rank. For, taking
say the element $b'_{12}$, this is by definition the component of the vector $\mathbf{b}'^{(2)}$ along
the new $x'_1$ axis, and (55) gives

$$\mathbf{b}'^{(2)} = \cos (x'_2, x_1)\, \mathbf{b}^{(1)} + \cos (x'_2, x_2)\, \mathbf{b}^{(2)} + \cos (x'_2, x_3)\, \mathbf{b}^{(3)}, \qquad (56)$$

so that $\mathbf{b}'^{(2)}$ is clearly a linear function of the vectors $\mathbf{b}^{(j)}$. All we have to do to
get $b'_{12}$ is to replace the $\mathbf{b}^{(j)}$ on the right-hand side of (56) by their projections
on the $x'_1$ axis, i.e. by the following expressions:

$$b_{1j} \cos (x'_1, x_1) + b_{2j} \cos (x'_1, x_2) + b_{3j} \cos (x'_1, x_3) \qquad (j = 1, 2, 3).$$

We now notice that, in accordance with table (3):

$$\cos (x'_i, x_k) = a_{ik} = \{A\}_{ik}.$$

We have on making these substitutions for the vectors on the right-hand
side of (56):

$$b'_{12} = \sum_{p,q=1}^{3} b_{pq} \{A\}_{1p} \{A\}_{2q},$$

which is precisely the same as (53). We can therefore assert that, if three vectors
$\mathbf{b}^{(1)}$, $\mathbf{b}^{(2)}$, $\mathbf{b}^{(3)}$ are defined for three mutually perpendicular directions and a
vector for any direction $(n)$ is defined by (54), the array of the nine numbers
giving the projections of the vectors $\mathbf{b}^{(k)}$ $(k = 1, 2, 3)$ on the $x_i$ axes in any
Cartesian system defines an affine orthogonal tensor of the second rank, i.e.
a tensor of the second rank, defined for all possible orthogonal transformations.




It must be noted that, when we speak of $\mathbf{b}^{(k)}$ corresponding to the direction
of a given $x_k$ axis, this does not mean that $\mathbf{b}^{(k)}$ must be directed along the $x_k$
axis. What is essential is expression (54), which relates the vector $\mathbf{b}^{(n)}$ to any
direction $(n)$, where the direction of the vector does not in general coincide
with $(n)$.

Two examples of affine orthogonal tensors of the second rank may be quoted.
The first of these is the stress tensor, familiar in the theory of elasticity. We
consider an infinitesimal surface $d\sigma$ with normal $(n)$ at a fixed point $M$ of an
elastic body under deformation. The action on the surface of the part of the
elastic medium lying on the side defined by the normal direction is taken in
the theory of elasticity to be equivalent to the product of a vector $\mathbf{b}^{(n)}$, dependent
on the direction $(n)$, with the magnitude $d\sigma$ of the area. By considering the
equilibrium conditions for an infinitesimal tetrahedron cut from the body,
we arrive at equation (54), which shows at once that the stress is a tensor of
the second rank. This tensor is given in any Cartesian system by an array of
nine numbers $\| b_{ik} \|$, the tensor being symmetrical, i.e. $b_{ik} = b_{ki}$, as shown
in the theory of elasticity. In other words, the projection on the $x_i$ axis of the
stress acting on an area perpendicular to the $x_k$ axis is equal to the projection
on the $x_k$ axis of the stress acting on an area perpendicular to the $x_i$ axis.

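We now turn to the second example. Let $\mathbf{c}(M)$ be a vector field. If we choose
a Cartesian system $(x_1, x_2, x_3)$ and take the derivatives of the field components
$(c_1, c_2, c_3)$ with respect to the coordinates, we get the following array of nine
quantities:

$$\begin{Vmatrix}
\frac{\partial c_1}{\partial x_1} & \frac{\partial c_1}{\partial x_2} & \frac{\partial c_1}{\partial x_3} \\
\frac{\partial c_2}{\partial x_1} & \frac{\partial c_2}{\partial x_2} & \frac{\partial c_2}{\partial x_3} \\
\frac{\partial c_3}{\partial x_1} & \frac{\partial c_3}{\partial x_2} & \frac{\partial c_3}{\partial x_3}
\end{Vmatrix}. \qquad (57)$$

We define a corresponding vector $\partial \mathbf{c} / \partial n$ for any given direction $(n)$; for
instance, the elements of the kth column of (57) give the components of the vector
that corresponds to the $x_k$ axis. We have the expression, for any direction
$(n)$ [II, 108]:

$$\frac{\partial c_i}{\partial n} = \cos (n, x_1) \frac{\partial c_i}{\partial x_1} + \cos (n, x_2) \frac{\partial c_i}{\partial x_2} + \cos (n, x_3) \frac{\partial c_i}{\partial x_3} \qquad (i = 1, 2, 3),$$

i.e. array (57) defines a tensor of the second rank. This tensor is in general neither
symmetric nor anti-symmetric. But it is readily expressible as the sum of a
symmetric and an anti-symmetric tensor, where the sum of two matrices is
understood to mean the matrix consisting of the sums of corresponding elements.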

We shall first make a general preliminary remark. The linearity of expression
(53) implies that, if $\| b_{ik} \|$ and $\| c_{ik} \|$ are two tensors, the sum
$\| b_{ik} + c_{ik} \|$ is likewise a tensor. Furthermore, the same formula remains valid
on interchange of the subscripts, i.e.

$$b'_{ki} = \sum_{p,q=1}^{3} b_{qp} \{A\}_{ip} \{A\}_{kq},$$

so that if a matrix defined for any axes yields a tensor, its transpose likewise
yields a tensor. Suppose now that we are given the tensor $\| b_{ik} \|$.
We can write this as a sum:

$$\| b_{ik} \| = \left\| \tfrac{1}{2} (b_{ik} + b_{ki}) \right\| + \left\| \tfrac{1}{2} (b_{ik} - b_{ki}) \right\|.$$


The first term on the right is clearly a symmetric tensor, and the second term 
an anti-symmetric tensor. 
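
The decomposition is easily carried out on a concrete array. In the following sketch the array `b` and the function names are hypothetical; exact fractions are used so that the halves are represented without rounding.

```python
from fractions import Fraction

def transpose(P):
    return [list(col) for col in zip(*P)]

def split(b):
    """Return the symmetric and anti-symmetric parts of a square array."""
    r = range(len(b))
    sym = [[Fraction(b[i][k] + b[k][i], 2) for k in r] for i in r]
    anti = [[Fraction(b[i][k] - b[k][i], 2) for k in r] for i in r]
    return sym, anti

b = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]   # an arbitrary array of nine numbers
sym, anti = split(b)

# Adding corresponding elements recovers the original array.
total = [[sym[i][k] + anti[i][k] for k in range(3)] for i in range(3)]
```

The first part is unchanged by interchanging rows and columns, the second changes sign, and their sum is the original array.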

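On applying this decomposition to the tensor defined by (57), we get for
its symmetric part:

$$\begin{Vmatrix}
\frac{\partial c_1}{\partial x_1} & \frac{1}{2} \left( \frac{\partial c_1}{\partial x_2} + \frac{\partial c_2}{\partial x_1} \right) & \frac{1}{2} \left( \frac{\partial c_1}{\partial x_3} + \frac{\partial c_3}{\partial x_1} \right) \\
\frac{1}{2} \left( \frac{\partial c_2}{\partial x_1} + \frac{\partial c_1}{\partial x_2} \right) & \frac{\partial c_2}{\partial x_2} & \frac{1}{2} \left( \frac{\partial c_2}{\partial x_3} + \frac{\partial c_3}{\partial x_2} \right) \\
\frac{1}{2} \left( \frac{\partial c_3}{\partial x_1} + \frac{\partial c_1}{\partial x_3} \right) & \frac{1}{2} \left( \frac{\partial c_3}{\partial x_2} + \frac{\partial c_2}{\partial x_3} \right) & \frac{\partial c_3}{\partial x_3}
\end{Vmatrix}. \qquad (58)$$

If we have deformation of a continuous medium and $\mathbf{c}$ is the displacement
vector, i.e. the vector giving the displacement of a point $M$ of the medium,
matrix (58) defines the so-called deformation tensor. The anti-symmetric part is:

$$\begin{Vmatrix}
0 & \frac{1}{2} \left( \frac{\partial c_1}{\partial x_2} - \frac{\partial c_2}{\partial x_1} \right) & \frac{1}{2} \left( \frac{\partial c_1}{\partial x_3} - \frac{\partial c_3}{\partial x_1} \right) \\
\frac{1}{2} \left( \frac{\partial c_2}{\partial x_1} - \frac{\partial c_1}{\partial x_2} \right) & 0 & \frac{1}{2} \left( \frac{\partial c_2}{\partial x_3} - \frac{\partial c_3}{\partial x_2} \right) \\
\frac{1}{2} \left( \frac{\partial c_3}{\partial x_1} - \frac{\partial c_1}{\partial x_3} \right) & \frac{1}{2} \left( \frac{\partial c_3}{\partial x_2} - \frac{\partial c_2}{\partial x_3} \right) & 0
\end{Vmatrix}. \qquad (59)$$

We had an example before of splitting a tensor into two parts, in the particular
case of a linear homogeneous deformation [II, 113], when we saw that the
anti-symmetric part corresponded to a rotation of the space as a whole (without
deformation) about a certain axis.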


25. The case of n-dimensional complex space. We now turn to the
general case of n-dimensional space. We have already defined a vector
in such space as a sequence of $n$ real or complex numbers [12]:

$$\mathbf{x} (x_1, x_2, \ldots, x_n),$$

the numbers being referred to as the components of x. We shall
assume that the space is referred to a definite fundamental set

$$\mathbf{a}^{(1)} (1, 0, \ldots, 0); \quad \mathbf{a}^{(2)} (0, 1, \ldots, 0); \quad \ldots; \quad \mathbf{a}^{(n)} (0, 0, \ldots, 1),$$

so that

$$\mathbf{x} = x_1 \mathbf{a}^{(1)} + x_2 \mathbf{a}^{(2)} + \ldots + x_n \mathbf{a}^{(n)}. \qquad (60)$$

Vector equality and the elementary operations on vectors were
defined by us in [12].




We define a linear transformation of n-dimensional space as the
passage from $\mathbf{x}(x_1, x_2, \ldots, x_n)$ to $\mathbf{y}(y_1, y_2, \ldots, y_n)$ in accordance with
the formulae:

$$y_i = a_{i1} x_1 + a_{i2} x_2 + \ldots + a_{in} x_n \qquad (i = 1, 2, \ldots, n), \qquad (61)$$

or alternatively:

$$\mathbf{y} = A \mathbf{x}, \qquad (62)$$

where $A$ is the matrix $\| a_{ik} \|_1^n$ of the transformation. If its
determinant $D(A)$ differs from zero, transformation (62) is said to be
non-singular, and $A$ is a non-singular matrix. In this case, on solving
equations (61) with respect to the $x_i$, we get the inverse
transformation to (61) or (62):

$$\mathbf{x} = A^{-1} \mathbf{y}, \qquad (63)$$

where the matrix $A^{-1}$ has elements

$$\{A^{-1}\}_{ik} = \frac{A_{ki}}{D(A)}, \qquad (64)$$

$D(A)$ being the determinant of matrix $A$ and $A_{ik}$ the cofactor of its
element $a_{ik}$.
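
Formula (64) translates almost word for word into a short program. The sketch below is an illustration only: the function names and the sample matrix `A` are hypothetical, and the determinant is computed by expansion along the first row, which is practical only for small orders.

```python
from fractions import Fraction

def minor(P, i, k):
    """P with row i and column k deleted."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(P) if r != i]

def det(P):
    """Determinant by expansion along the first row."""
    if not P:
        return 1
    return sum((-1) ** k * P[0][k] * det(minor(P, 0, k)) for k in range(len(P)))

def cofactor(P, i, k):
    """Cofactor of the element in row i, column k."""
    return (-1) ** (i + k) * det(minor(P, i, k))

def inverse(P):
    """Element (i, k) of the inverse is cofactor(k, i) / D(P), as in (64)."""
    d, n = Fraction(det(P)), len(P)
    return [[cofactor(P, k, i) / d for k in range(n)] for i in range(n)]

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][s] * Q[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]

A = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]   # sample non-singular matrix, D(A) = 7
A_inv = inverse(A)
identity = [[int(i == k) for k in range(3)] for i in range(3)]
```

Multiplying `A` by `A_inv` in either order gives the unit matrix; note the interchange of subscripts between the element of the inverse and the cofactor.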

The definition of the product of two transformations is also
analogous to the previous definition [21]: successive application of the
two transformations

$$\mathbf{y} = A \mathbf{x}; \quad \mathbf{z} = B \mathbf{y}$$

is equivalent to the single linear transformation

$$\mathbf{z} = B A \mathbf{x},$$

which is called the product of transformations $A$ and $B$, and the
matrix of which is given by

$$\{BA\}_{ik} = \sum_{s=1}^{n} \{B\}_{is} \{A\}_{sk}. \qquad (65)$$

The product in general depends on the order of the factors, i.e. we
have, apart from exceptional cases:

$$BA \neq AB.$$

The definition of product is readily extended to the case of any
number of factors, the associative law being applicable here, i.e.
factors may be grouped:

$$(CB) A = C (BA). \qquad (66)$$
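
These properties of the product can be seen on second order matrices. In the sketch below the matrices `A`, `B`, `C`, the vector `x` and the function names are hypothetical choices made for illustration.

```python
def matmul(B, A):
    """Element (i, k) of BA is the sum over s of B[i][s] * A[s][k], as in (65)."""
    n = len(A)
    return [[sum(B[i][s] * A[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]

def apply(A, x):
    """Components of y = A x, as in (61)."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
x = [5, -1]

z_stepwise = apply(B, apply(A, x))    # y = A x followed by z = B y
z_product = apply(matmul(B, A), x)    # z = (BA) x in a single step

BA, AB = matmul(B, A), matmul(A, B)   # the two orders of multiplication
left = matmul(matmul(C, B), A)        # (CB) A
right = matmul(C, matmul(B, A))       # C (BA)
```

Here `BA` and `AB` differ, while the two groupings of the triple product agree, as (66) asserts.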




The inverse transformation satisfies the relationships:

$$A A^{-1} = A^{-1} A = I; \quad (A^{-1})^{-1} = A, \qquad (67)$$

where $I$ is the so-called unit matrix whose elements are unity on the
principal diagonal and zero elsewhere. The unit matrix corresponds
to the identity transformation

$$y_i = x_i \qquad (i = 1, 2, \ldots, n).$$

We define a diagonal matrix of order $n$ as above:

$$[k_1, k_2, \ldots, k_n] = \begin{Vmatrix}
k_1 & 0 & 0 & \ldots & 0 \\
0 & k_2 & 0 & \ldots & 0 \\
0 & 0 & k_3 & \ldots & 0 \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
0 & 0 & 0 & \ldots & k_n
\end{Vmatrix}. \qquad (68)$$

This corresponds to the transformation:

$$y_i = k_i x_i \qquad (i = 1, 2, \ldots, n).$$

The product of diagonal matrices is independent of the order of the
factors and is given by

$$[k_1, k_2, \ldots, k_n] [l_1, l_2, \ldots, l_n] = [l_1, l_2, \ldots, l_n] [k_1, k_2, \ldots, k_n] = [k_1 l_1, k_2 l_2, \ldots, k_n l_n].$$

In the particular case $k_1 = k_2 = \ldots = k_n = k$, we get the matrix

$$\begin{Vmatrix}
k & 0 & 0 & \ldots & 0 \\
0 & k & 0 & \ldots & 0 \\
0 & 0 & k & \ldots & 0 \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
0 & 0 & 0 & \ldots & k
\end{Vmatrix} \qquad (69)$$

corresponding to multiplication of all the components of a vector by
the number $k$. We shall take matrix (69) for the simple number $k$, i.e.
a number as a particular case of a matrix, which corresponds to what
we said towards the beginning of this article. It may easily be seen
by using (65) that the product of a number $k$, treated as matrix (69),
with any matrix $A$ is independent of the order of the factors and
reduces to multiplication of all the elements of $A$ by the number $k$:

$$\{[k, k, \ldots, k] A\}_{ik} = \{kA\}_{ik} = k \{A\}_{ik}. \qquad (70)$$




Now suppose we have taken a new fundamental set $\mathbf{b}^{(k)}$ in place
of the $\mathbf{a}^{(k)}$ above, the new set being expressed in terms of the $\mathbf{a}^{(k)}$ by
the formulae:

$$\mathbf{b}^{(1)} = t_{11} \mathbf{a}^{(1)} + t_{12} \mathbf{a}^{(2)} + \ldots + t_{1n} \mathbf{a}^{(n)}$$
$$\mathbf{b}^{(2)} = t_{21} \mathbf{a}^{(1)} + t_{22} \mathbf{a}^{(2)} + \ldots + t_{2n} \mathbf{a}^{(n)} \qquad (71)$$
$$\ldots$$
$$\mathbf{b}^{(n)} = t_{n1} \mathbf{a}^{(1)} + t_{n2} \mathbf{a}^{(2)} + \ldots + t_{nn} \mathbf{a}^{(n)},$$

where the determinant made up of elements $t_{ik}$ does not vanish. We
can now, conversely, write the $\mathbf{a}^{(k)}$ linearly in terms of the $\mathbf{b}^{(k)}$, and
any linear combination of the $\mathbf{a}^{(k)}$ is at the same time a linear
combination of the $\mathbf{b}^{(k)}$, and vice versa. In other words, the $\mathbf{b}^{(k)}$ taken as the
fundamental set form the same space as the $\mathbf{a}^{(k)}$. If a vector x has
components $(x_1, \ldots, x_n)$ in the system of coordinates defined by the
fundamental set $\mathbf{a}^{(k)}$, in the system defined by the fundamental set
$\mathbf{b}^{(k)}$ it will have different components $(x'_1, \ldots, x'_n)$, these being
expressible in terms of the previous components by means of the linear
transformation contragredient to transformation (71), which can be
written as

$$(x'_1, \ldots, x'_n) = T^{(*)-1} (x_1, \ldots, x_n), \qquad (72)$$

where $T^{(*)}$ is the transpose of matrix $T$ of (71).
If we have a spatial transformation given by (62) in the original
coordinate system, it will be given in the new coordinate system by

$$\mathbf{y}' = U A U^{-1} \mathbf{x}', \qquad (73)$$

where

$$U = T^{(*)-1}.$$

The matrix

$$U A U^{-1}$$

is said to be similar to $A$.

The basic concepts in the above discussion are those of vector and
matrix. Sometimes the vector $\mathbf{x}(x_1, \ldots, x_n)$ is itself regarded as a
matrix, where one column, no matter which, consists of the numbers
$x_1, \ldots, x_n$, whilst the remaining columns are filled with zeros. Suppose,
for instance, that we put the vector components in the first column.
Such a matrix, in which only the elements of one column are
non-zero, is occasionally denoted by a single column, so that the vector
becomes in matrix form:

$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{Vmatrix}
x_1 & 0 & \ldots & 0 \\
x_2 & 0 & \ldots & 0 \\
\ldots & \ldots & \ldots & \ldots \\
x_n & 0 & \ldots & 0
\end{Vmatrix}. \qquad (74)$$

We now show that linear transformation (62) can be written as the
product of the transformation matrix $A$ and matrix (74). We multiply
matrix (74) on the left by $A$ in accordance with rule (65) and use the fact that
only the elements of the first column of (74) are non-zero; this means
that only the elements of the first column of the matrix product
differ from zero, whilst it is easily seen that the non-zero elements are

$$y_i = a_{i1} x_1 + a_{i2} x_2 + \ldots + a_{in} x_n,$$

i.e. they in fact give linear transformation (62). We can thus write (62)
in the form:

$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = A \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad (75)$$

where the right-hand side is the product of two matrices.
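We conclude the present section by noticing again the general laws
obeyed by operations on vectors in n-dimensional space:

$$\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}; \quad (\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z}).$$

If x and y are any two vectors, the vector $\mathbf{z} = \mathbf{y} - \mathbf{x}$ is unique,
has components $(y_i - x_i)$, and satisfies $\mathbf{x} + \mathbf{z} = \mathbf{y}$.
Let $a$ and $b$ be any numbers. We have:

$$(a + b) \mathbf{x} = a \mathbf{x} + b \mathbf{x}; \quad a (b \mathbf{x}) = (ab) \mathbf{x}; \quad a (\mathbf{x} + \mathbf{y}) = a \mathbf{x} + a \mathbf{y}.$$

We have $1 \mathbf{x} = \mathbf{x}$ for the number unity, and $0 \mathbf{x} = \mathbf{0}$, where the $\mathbf{0}$
on the right-hand side denotes the vector, all the components of
which vanish.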


26. Basic matrix calculus. Earlier sections have contained formulae 
in which the matrix appears as a new symbol, on which a number of 
operations analogous to those on ordinary numbers could be carried 
out. This naturally suggests the idea of constructing a new algebra 
which would be satisfied by symbols representing matrices. In other
words, we propose to regard a matrix as a new type of number, or as
a hypercomplex number. Just as we arrived previously with the aid
of two real numbers at the idea of a complex number of the form
$a + ib$, so now we arrive at the new number, the matrix, with the aid
of $n^2$ complex numbers $a_{ik}$, arranged as a square array. An essential
difference must be pointed out, however. We have seen that all the
formal operations of the algebra of real numbers can be carried out on
the letters symbolizing complex numbers. The same cannot be said as
regards matrices. Matrix algebra derives its fundamental difference
from the algebra of complex numbers from the non-commutative
nature of multiplication, i.e. a product depends on the order of the
factors. We now propose to establish the basic rules of matrix algebra;
the results already obtained when regarding a matrix as the array
of a linear transformation will be used as a guide in most relationships.

We shall consider square matrices of the same order $n$ throughout
what follows, unless some special remark is made. We use the same
notation as above, of $\{A\}_{ik}$ for the elements of the matrix $A$.

Two matrices $A$ and $B$ are reckoned equal when, and only when,

$$\{A\}_{ik} = \{B\}_{ik} \qquad (i, k = 1, 2, \ldots, n), \qquad (76)$$

i.e. when all corresponding elements are the same.
A matrix sum is defined by the formula:

$$\{A + B\}_{ik} = \{A\}_{ik} + \{B\}_{ik}, \qquad (77)$$

i.e. corresponding elements are added.
Multiplication is defined by

$$\{BA\}_{ik} = \sum_{s=1}^{n} \{B\}_{is} \{A\}_{sk}. \qquad (78)$$

As we saw above, in general

$$BA \neq AB,$$

though the associative law is valid [21]:

$$(CB) A = C (BA). \qquad (79)$$

The determinant of a product is equal to the product of the
determinants of the matrix factors:

$$D(BA) = D(B) \cdot D(A). \qquad (80)$$

The distributive law is clearly also valid:

$$(A + B) C = AC + BC \quad \text{and} \quad C (A + B) = CA + CB. \qquad (81)$$




A special feature of matrix multiplication should be noted: a
product can be zero, i.e. all its elements can vanish, although none of the
individual factors vanish. The example may be quoted of the two
identical second order matrices:

$$\begin{Vmatrix} 0 & 0 \\ 1 & 0 \end{Vmatrix} \cdot \begin{Vmatrix} 0 & 0 \\ 1 & 0 \end{Vmatrix} = \begin{Vmatrix} 0 & 0 \\ 0 & 0 \end{Vmatrix}.$$
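
A vanishing product of non-vanishing factors is quickly confirmed by direct multiplication. The matrix `N` below, with a single non-zero element, is our own illustrative choice, as is the function name.

```python
def matmul(P, Q):
    """Element (i, k) of PQ is the sum over s of P[i][s] * Q[s][k]."""
    n = len(P)
    return [[sum(P[i][s] * Q[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]

N = [[0, 0], [1, 0]]              # not the zero matrix
product = matmul(N, N)            # yet its square vanishes
has_nonzero_element = any(v != 0 for row in N for v in row)
```

Every element of `product` is zero although `N` itself has a non-zero element.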

The concept of the inverse matrix $A^{-1}$ is brought in exactly as in a
previous section; $A$ must be non-singular, i.e. $D(A) \neq 0$. If $C = BA$
and $R_A$, $R_B$, $R_C$ are the ranks of $A$, $B$, $C$, we saw in [7] that
$R_C \le R_A$. If $B$ is non-singular, $A = B^{-1} C$, and we can say as
above that $R_A \le R_C$, and consequently $R_C = R_A$, i.e. the rank of a
matrix $A$ is unchanged on multiplication on the right or left by a
non-singular matrix $B$. We have the relationships for the unit matrix $I$:

$$BI = IB = B, \qquad (82)$$


where $B$ is any matrix.
It may easily be seen that $A^{-1}$ is the unique solution of the equations

$$AX = I \quad \text{and} \quad XA = I, \qquad (83)$$

where $I$ is the unit matrix. For, on multiplying say the first equation
on the left by $A^{-1}$ and taking (79) and (67) into account, we get
$X = A^{-1}$, and similarly for the second equation. It must be noticed
that (83) has no solution whatever if $D(A) = 0$, i.e. no inverse of $A$
exists. For otherwise, (83) would give us

$$D(A)\, D(X) = 1,$$

which contradicts the condition $D(A) = 0$.

The concept of diagonal matrix should be recalled from the previous
section, as also the fact that any number $k$ can be regarded as a
particular case of a matrix. We can easily bring in positive integral powers
of a matrix:

$$A^p = A \cdot A \ldots A.$$

Negative integral powers of a matrix are introduced as positive
integral powers of the inverse matrix, i.e.

$$A^{-p} = (A^{-1})^p. \qquad (84)$$

We obviously have

$$A^{-p} = (A^p)^{-1}, \quad \text{i.e.} \quad A^{-p} A^p = A^p A^{-p} = I. \qquad (85)$$




The symbol for the quotient of two matrices:

$$\frac{A}{B}$$

does not have a definite meaning. We can interpret it in two ways: as
the product $A B^{-1}$, or as $B^{-1} A$; these products are in general distinct,
and it is only in the particular cases when they coincide that the
quotient symbol has an exact significance.
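
The two readings of the quotient can be compared on second order matrices. The matrices `A`, `B`, `K` and the helper names in this sketch are hypothetical; `K` represents the number 2 regarded as a matrix.

```python
from fractions import Fraction

def inverse2(P):
    """Inverse of a non-singular second order matrix."""
    (a, b), (c, d) = P
    D = Fraction(a * d - b * c)
    return [[d / D, -b / D], [-c / D, a / D]]

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][s] * Q[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[1, 1], [0, 1]]
K = [[2, 0], [0, 2]]

right_quotient = matmul(A, inverse2(B))   # A B^{-1}
left_quotient = matmul(inverse2(B), A)    # B^{-1} A

# With a divisor that is a number, both readings coincide.
right_by_number = matmul(A, inverse2(K))
left_by_number = matmul(inverse2(K), A)
```

For this `B` the two products differ, whereas division by the number 2 gives the same result in either order.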
A further basic concept is that of similar matrices which we also
introduced in the previous section. We shall note a number of formulae
which are very easily proved:

$$(CBA)^{-1} = A^{-1} B^{-1} C^{-1}, \qquad (86)$$
$$C B A C^{-1} = (C B C^{-1}) (C A C^{-1}). \qquad (87)$$

If $A^{(*)}$ denotes the transpose of any matrix $A$, we also have:

$$(CBA)^{(*)} = A^{(*)} B^{(*)} C^{(*)}, \qquad (88)$$

which is easily verified by using the definition of product. Two new
notations must be introduced. We shall write $\bar{A}$ for the matrix whose
elements are the conjugates of the elements of $A$, i.e.

$$\{\bar{A}\}_{ik} = \overline{\{A\}_{ik}}, \qquad (89)$$

a symbol of the type $\bar{a}$ being used as usual to denote the conjugate of $a$.
Lastly, we shall write $\tilde{A}$ for the matrix obtained by interchanging rows
and columns in $A$ and replacing all the elements by their conjugates, i.e.

$$\{\tilde{A}\}_{ik} = \overline{\{A\}_{ki}}. \qquad (90)$$

The matrix $\tilde{A}$ is generally called the Hermitian conjugate
of matrix $A$ (after Hermite, a French mathematician of the
latter half of the nineteenth century). It may easily be verified that

$$\widetilde{CBA} = \tilde{A} \tilde{B} \tilde{C}. \qquad (91)$$

We suggest that the following elementary formula also be verified:

$$(A^{(*)})^{-1} = (A^{-1})^{(*)},$$

i.e. the signs of the inverse and transpose can change places, as already
mentioned in [20].
We notice an expression which will be useful later. It follows at
once from (67) that

$$D(A)\, D(A^{-1}) = 1,$$

i.e.

$$D(A^{-1}) = D(A)^{-1}. \qquad (92)$$




In other words, the determinant of the inverse matrix has a value 
equal to the reciprocal of the determinant of the original matrix. 
The concept of diagonal matrix may be generalized to that of
quasi-diagonal matrix. This will be explained in a particular case. Let us
take the seventh order matrix:

$$\begin{pmatrix}
b_{11} & b_{12} & b_{13} & 0 & 0 & 0 & 0 \\
b_{21} & b_{22} & b_{23} & 0 & 0 & 0 & 0 \\
b_{31} & b_{32} & b_{33} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & c_{11} & c_{12} & 0 & 0 \\
0 & 0 & 0 & c_{21} & c_{22} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & d_{11} & d_{12} \\
0 & 0 & 0 & 0 & 0 & d_{21} & d_{22}
\end{pmatrix}$$

Let B denote the third order matrix with elements $b_{ik}$, and C and D
the second order matrices with elements $c_{ik}$ and $d_{ik}$ respectively.
The above seventh order matrix is called a quasi-diagonal matrix of
structure {3, 2, 2} and is denoted by

$$[B, C, D].$$


We suppose in general that the principal diagonal of an nth order
matrix, made up of the elements $a_{ii}$, is divided into m parts, the
first part consisting of the first $k_1$ elements, the second of the next
$k_2$ elements, and so on, so that $k_1 + \dots + k_m = n$. We can regard
the first $k_1$ elements as the principal diagonal of a matrix $X_1$ of
order $k_1$; the next $k_2$ elements as the principal diagonal of a matrix
$X_2$ of order $k_2$, and so on. Suppose that all the elements of our
matrix A not belonging to matrices $X_s$ are zero. Then A is known as a
quasi-diagonal matrix of structure $\{k_1, \dots, k_m\}$ and is written
symbolically:

$$A = [X_1, X_2, \dots, X_m].$$


The rules for operating on quasi-diagonal matrices of the same
structure are of unusual simplicity. We shall state the relevant
formulae and omit the proofs: these are based on the definitions of the
operations and are purely elementary. We have for addition of
quasi-diagonal matrices of the same structure:

$$[X_1, X_2, \dots, X_m] + [Y_1, Y_2, \dots, Y_m] = [X_1 + Y_1, X_2 + Y_2, \dots, X_m + Y_m], \tag{93}$$

where the fact of the same structure implies that every matrix $X_s$
is of the same order as the corresponding matrix $Y_s$. Similarly, we



have for multiplication, and for raising to a power:

$$[Y_1, Y_2, \dots, Y_m]\,[X_1, X_2, \dots, X_m] = [Y_1X_1, Y_2X_2, \dots, Y_mX_m], \tag{94}$$

$$[X_1, X_2, \dots, X_m]^p = [X_1^p, X_2^p, \dots, X_m^p], \tag{95}$$

where p is any positive or negative integer, except, of course, that none
of the determinants $D(X_s)$ must vanish if p is negative.
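The multiplication rule (94) can be tried on a small case. The sketch below is illustrative only: the blocks, the structure {2, 1}, and the helper names `matmul` and `quasi_diagonal` are assumptions introduced for this example.

```python
def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][s] * Q[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]


def quasi_diagonal(blocks):
    """Assemble square blocks X_1, ..., X_m into the matrix [X_1, ..., X_m]."""
    n = sum(len(X) for X in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for X in blocks:
        k = len(X)
        for i in range(k):
            for j in range(k):
                M[offset + i][offset + j] = X[i][j]
        offset += k
    return M


# Two quasi-diagonal matrices of the same structure {2, 1} (assumed data).
X = [[[1, 2], [3, 4]], [[5]]]
Y = [[[0, 1], [1, 1]], [[2]]]

# Left-hand side of (94): multiply the full third order matrices.
lhs = matmul(quasi_diagonal(Y), quasi_diagonal(X))
# Right-hand side of (94): multiply block by block, then assemble.
rhs = quasi_diagonal([matmul(Yk, Xk) for Yk, Xk in zip(Y, X)])
```

Both routes give the same third order matrix, which is what (94) asserts: the blocks never mix under multiplication.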

The rule for the similarity transformation of a matrix
$[X_1, X_2, \dots, X_m]$ with the aid of a matrix of the same structure is
given by

$$[Y_1, \dots, Y_m]\,[X_1, \dots, X_m]\,[Y_1, \dots, Y_m]^{-1} = [Y_1X_1Y_1^{-1}, Y_2X_2Y_2^{-1}, \dots, Y_mX_mY_m^{-1}]. \tag{96}$$


The geometric interpretation of the linear transformations supplied 
by quasi-diagonal matrices is seen as follows. We take for simplicity 
the seventh order matrix used above, the structure of which is defined 
by the numbers {3, 2, 2}, and we consider the corresponding linear 
transformation. If we have in the original vector $(x_1, \dots, x_7)$:

$$x_4 = x_5 = x_6 = x_7 = 0,$$

we obviously have in the transformed vector:

$$y_4 = y_5 = y_6 = y_7 = 0,$$
so that, in fact, all the vectors belonging to the subspace formed 
by the first three fundamental vectors belong to the same subspace 
after transformation, and the transformation will itself be defined 
by a third order matrix B. The same applies to the subspace formed 
by the next two fundamental vectors, and similarly, to that formed 
by the last two. 


It may be recalled here that the subspace formed by vectors
$x^{(1)}, \dots, x^{(l)}$ is defined as the system of vectors given by

$$c_1x^{(1)} + \dots + c_lx^{(l)},$$

where $c_1, \dots, c_l$ are arbitrary constants.


27. Characteristic roots of matrices and reduction to canonical form. 
Though similar matrices are obviously not equal in the sense of (76), 
they are equivalent in the geometrical sense inasmuch as they embody 
the same linear transformation of space, expressed in different
coordinate systems. We shall now look for the invariants of these
matrices, i.e. the expressions made up of elements which would have the
same values for all similar matrices. One invariant is easily found.




This is the determinant of the matrix. For, given A, let $UAU^{-1}$ be a
similar matrix, where U is any matrix with non-zero determinant.
We have by (80) and (92):

$$D(UAU^{-1}) = D(U)\,D(A)\,D(U^{-1}) = D(U)\,D(A)\,D(U)^{-1} = D(A).$$


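We form another invariant by taking the polynomial $\varphi(\lambda)$ of
degree n in a parameter $\lambda$, equal to the determinant of the matrix
obtained from A by subtracting $\lambda$ from each diagonal element, i.e.
by taking

$$\varphi(\lambda) = \begin{vmatrix}
a_{11} - \lambda & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} - \lambda & \dots & a_{2n} \\
\dots & \dots & \dots & \dots \\
a_{n1} & a_{n2} & \dots & a_{nn} - \lambda
\end{vmatrix}, \tag{97}$$

where the $a_{ik}$ are the elements of A. We can write this alternatively as:

$$\varphi(\lambda) = D(A - \lambda) = D(A - \lambda I), \tag{98}$$

since $\lambda$ or $\lambda I$ is a diagonal matrix by hypothesis, in which
all the elements on the principal diagonal are equal to $\lambda$. On
replacing A by $UAU^{-1}$ and noticing that commutation of $\lambda$ with
any matrix is possible, so that $U\lambda U^{-1} = \lambda$, we have:

$$D(UAU^{-1} - \lambda) = D[U(A - \lambda)U^{-1}] = D(U)\,D(A - \lambda)\,D(U^{-1}),$$

whence

$$D(UAU^{-1} - \lambda) = D(A - \lambda). \tag{99}$$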


Hence we see that polynomial (97) is the same when formed for
$UAU^{-1}$ as when formed for A. In other words, all the coefficients of
(97) are invariants with respect to similarity transformations. The
leading coefficient, that of $\lambda^n$, is obviously $(-1)^n$. We shall
pay particular attention to the constant term and the coefficient of
$(-1)^{n-1}\lambda^{n-1}$. The former is clearly the determinant, an
invariant that we have already mentioned. The latter may be seen, on
using the results of [5], to be equal to the sum of the diagonal
elements. This sum is generally called the trace of the matrix, and is
written as follows:

$$\operatorname{Tr}(A) = \{A\}_{11} + \{A\}_{22} + \dots + \{A\}_{nn} = a_{11} + a_{22} + \dots + a_{nn},$$

where Tr is occasionally written Sp, from the German "Spur" (meaning
"trace"). Similar matrices thus have the same determinant and the
same trace.
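The invariance of the trace and determinant is easy to try on a second order example. The sketch below is illustrative: the matrices A and U are assumed, U being chosen with D(U) = 1 so that its inverse has integer elements, and the helper names are hypothetical.

```python
def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][s] * Q[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]


def trace(M):
    """Sum of the diagonal elements."""
    return sum(M[i][i] for i in range(len(M)))


def det2(M):
    """Determinant of a second order matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A = [[1, 2], [3, 4]]
U = [[2, 1], [1, 1]]        # D(U) = 1
U_inv = [[1, -1], [-1, 2]]  # inverse of U, written out by hand

similar = matmul(matmul(U, A), U_inv)  # the similar matrix U A U^{-1}
```

For n = 2 the polynomial (97) is $\lambda^2 - \operatorname{Tr}(A)\,\lambda + D(A)$, so equality of trace and determinant here means that A and the similar matrix have the same characteristic polynomial.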




We now write the equation

$$D(A - \lambda) = 0, \tag{100}$$


known as the characteristic equation of matrix A, its roots being called 
the characteristic roots, or eigenvalues, of A. We can say from the 
above that similar matrices have the same characteristic roots. An 
equation of the form (100) has already been encountered. 

We now pose the question: is there a matrix V such that, on carrying
out the similarity transformation on a given matrix A, the resulting
$V^{-1}AV$ is a diagonal matrix? Or alternatively, expressed from the
point of view of linear transformations of space, is it possible to choose
coordinate axes such that a linear transformation characterized by the
matrix A in the original coordinate system reduces in the new system
simply to a transformation of the form $y_k = \lambda_k x_k$? We should
remark that the fact of writing $V^{-1}AV$ instead of the previous
$UAU^{-1}$ is of no real consequence.

We can write down our condition as:

$$V^{-1}AV = [\lambda_1, \lambda_2, \dots, \lambda_n], \tag{101}$$

where it is required to find the elements of V and the numbers
$\lambda_k$. By multiplying both sides on the left by V, we can evidently
rewrite the condition as

$$AV = V\,[\lambda_1, \lambda_2, \dots, \lambda_n]. \tag{102}$$

We now use (65) to find the elements on both sides with subscripts
i and k. This gives us $n^2$ equations:

$$\sum_{s=1}^{n} a_{is}v_{sk} = v_{ik}\lambda_k,$$

where $a_{ik}$ and $v_{ik}$ are the elements of A and V.

We fix the second subscript k and put i = 1, 2, ..., n, which gives
us n equations containing only the number $\lambda_k$ and the elements
$v_{1k}, \dots, v_{nk}$ of the kth column of V:

$$\sum_{s=1}^{n} a_{is}v_{sk} = \lambda_k v_{ik} \quad (i = 1, 2, \dots, n). \tag{103}$$

If we take the elements $(v_{1k}, \dots, v_{nk})$ as the components of a
vector $v^{(k)}$, we can write the above set as the single vector equation:

$$Av^{(k)} = \lambda_k v^{(k)}. \tag{104}$$




Hence the discovery of the matrix V that reduces A to the diagonal
form amounts to finding the vectors $v^{(k)}$ that are reproduced
identically except for a numerical factor as a result of the linear
transformation defined by A. This is the algebraic analogue of the
position in present-day quantum theory, according to which Heisenberg's
matrix mechanics is in essence equivalent to Schrödinger's wave
mechanics. From the former point of view, the basic problem is that
of reducing an (infinite) matrix to the diagonal form. As regards wave
mechanics, the essential problem here is that of finding vectors (in
space with an infinite set of dimensions) such that they are identically
reproduced except for a numerical factor as a result of a linear
transformation. We have referred to the above discussion as the algebraic
analogue inasmuch as the problems are reduced to purely algebraic
problems by confining ourselves to space with a finite number of
dimensions. The more complex case of space with an infinite set of
dimensions requires an essential departure from ordinary algebra
and makes use of the apparatus of analysis. All these questions are
treated in detail later, though it may be mentioned meantime that
applications to physics in the case of a finite number of dimensions
require only matrices A of a particular type (Hermitian matrices for
which $a_{ik} = \bar a_{ki}$) and matrices U likewise of a definite type
(unitary matrices, the definition of which is given below). Although we
shall consider here the general problem for any finite matrix, we confine
ourselves to the statement of final results without full proofs. A full
solution will only be given in the case of problems of practical interest.

We turn to the solution of system (103) or (104). This becomes,
when written out in full:

$$\begin{aligned}
(a_{11} - \lambda_k)v_{1k} + a_{12}v_{2k} + \dots + a_{1n}v_{nk} &= 0 \\
a_{21}v_{1k} + (a_{22} - \lambda_k)v_{2k} + \dots + a_{2n}v_{nk} &= 0 \\
\dots\dots\dots\dots\dots\dots\dots\dots & \\
a_{n1}v_{1k} + a_{n2}v_{2k} + \dots + (a_{nn} - \lambda_k)v_{nk} &= 0
\end{aligned} \tag{105}$$

The necessary and sufficient condition for obtaining a non-zero
solution for $(v_{1k}, \dots, v_{nk})$ is the vanishing of the determinant
of the system, i.e. $\lambda_k$ must be a characteristic root of A. We
shall only treat in detail the case when the characteristic roots are
distinct. Let the roots be $\lambda_1, \dots, \lambda_n$. On substituting
the first root $\lambda_1$ for $\lambda_k$ in the coefficients of system
(105), we can find from this the elements of the first column of V. We
shall not go into the question of how wide is the choice of the
$v_{i1}$. We choose a solution of the system in any uniquely




defined manner with the only proviso that it be non-zero. Similarly,
on putting $\lambda_k = \lambda_2$ in the coefficients of (105), we can
find the elements of the second column of V, and so on, as far as the nth
column. Equations (105) are equivalent to (102), and all we need in order
to arrive at the basic equation (101) is the existence of the inverse
$V^{-1}$ of V, i.e. non-vanishing of the determinant of V. We prove this
latter property by assuming the contrary, i.e. let the determinant vanish.
As we know from [12], this is equivalent to the existence of a linear
relationship between the vectors $v^{(k)}$ defined by the columns of V:

$$C_1v^{(1)} + \dots + C_nv^{(n)} = 0,$$

where not all the coefficients $C_k$ are zero. We apply the transformation
defined by matrix A (n - 1) times to both sides of this equation.
Using (104), we have the n equations:

$$\begin{aligned}
C_1v^{(1)} + \dots + C_nv^{(n)} &= 0 \\
\lambda_1C_1v^{(1)} + \dots + \lambda_nC_nv^{(n)} &= 0 \\
\dots\dots\dots\dots\dots\dots & \\
\lambda_1^{n-1}C_1v^{(1)} + \dots + \lambda_n^{n-1}C_nv^{(n)} &= 0.
\end{aligned}$$

Since not all the vectors $C_kv^{(k)}$ vanish, we can say that the
determinant of the system must vanish:

$$\begin{vmatrix}
1 & 1 & \dots & 1 \\
\lambda_1 & \lambda_2 & \dots & \lambda_n \\
\dots & \dots & \dots & \dots \\
\lambda_1^{n-1} & \lambda_2^{n-1} & \dots & \lambda_n^{n-1}
\end{vmatrix} = 0,$$


where the numbers $\lambda_k$ are distinct by hypothesis. But this last equation
contradicts the fact that the Vandermonde determinant of distinct 
numbers cannot vanish. We have thus proved the possibility of re- 
ducing the matrix by means of the similarity transformation to the 
diagonal form in the case when all the characteristic roots of the matrix 
are distinct. When some characteristic roots are equal, it may happen 
that the matrix cannot be reduced by a similarity transformation to 
the diagonal form. None the less, there exists in this case a simplest or 
canonical form of the matrix. In the case when the matrix reduces 
to the diagonal form, the canonical form becomes 


$$[\lambda_1, \lambda_2, \dots, \lambda_n],$$




where $\lambda_k$ are the characteristic roots of the matrix. We shall
merely state the result in the general case.† Let $\lambda = a$ be a root
of equation (100) of multiplicity k. Further, let $\lambda = a$ be a root
of multiplicity $k_1$, but not more than $k_1$, of all the (n - 1)th order
determinants of the array on the left-hand side of (100), i.e. every such
determinant is divisible by $(\lambda - a)^{k_1}$, but at least one is not
divisible by $(\lambda - a)^{k_1 + 1}$. Similarly, let all the (n - 2)th
order determinants have $\lambda = a$ as a root of multiplicity $k_2$, but
not more than $k_2$, and so on, till finally the same root is of
multiplicity $k_m$ for all the (n - m)th order determinants, whereas at
least one of the (n - m - 1)th order determinants is non-vanishing for
$\lambda = a$. This last will evidently be true for the successive
lower order determinants. It can be shown that the sequence of numbers
$k_s$ is decreasing, i.e.

$$k > k_1 > k_2 > \dots > k_m.$$

We bring in the following positive integers:

$$l_1 = k - k_1; \quad l_2 = k_1 - k_2; \quad \dots; \quad l_{m+1} = k_m,$$

where obviously, $l_1 + l_2 + \dots + l_{m+1} = k$.
The expressions:

$$(\lambda - a)^{l_1}; \quad (\lambda - a)^{l_2}; \quad \dots; \quad (\lambda - a)^{l_{m+1}}$$

are known as the elementary divisors of matrix A corresponding to the
root $\lambda = a$. We can similarly find the elementary divisors for all
other characteristic roots of A and hence obtain the set of elementary
divisors:

$$(\lambda - \lambda_1)^{\rho_1}; \quad (\lambda - \lambda_2)^{\rho_2}; \quad \dots; \quad (\lambda - \lambda_p)^{\rho_p}, \tag{106}$$

where

$$\rho_1 + \rho_2 + \dots + \rho_p = n \tag{107}$$

and not all the $\lambda_k$ need be distinct.

We saw above that the characteristic roots are unchanged by a 
similarity transformation. The set of elementary divisors of a matrix 
happens to possess the same property. We now introduce some new 
elementary matrices $I_\rho(a)$, where the symbol represents the matrix
of order $\rho$ in which a is repeated down the main diagonal, unity is
repeated down the diagonal below, and the remaining elements are
zero:

$$I_\rho(a) = \begin{pmatrix}
a & 0 & 0 & \dots & 0 & 0 \\
1 & a & 0 & \dots & 0 & 0 \\
0 & 1 & a & \dots & 0 & 0 \\
\dots & \dots & \dots & \dots & \dots & \dots \\
0 & 0 & 0 & \dots & a & 0 \\
0 & 0 & 0 & \dots & 1 & a
\end{pmatrix}. \tag{108}$$

† A proof will be found in a special note at the end of Part 2 of this
Volume.


The following result is fundamental for the problem of representing 
a matrix in the canonical form: if A has elementary divisors (106), 
there exists a matrix U with a non-zero determinant such that 


$$UAU^{-1} = [I_{\rho_1}(\lambda_1), I_{\rho_2}(\lambda_2), \dots, I_{\rho_p}(\lambda_p)]. \tag{109}$$


We mention that finding U reduces to elementary algebraic operations
if all the characteristic roots of A are known. If $\rho = 1$,
$I_\rho(a)$ is understood to mean simply the number a. It can happen that,
even with the existence of equal characteristic roots, all the elementary
divisors (106) are simple, i.e. have the form

$$(\lambda - \lambda_1); \quad (\lambda - \lambda_2); \quad \dots; \quad (\lambda - \lambda_n).$$

In this case the quasi-diagonal matrix

$$[I_{\rho_1}(\lambda_1), I_{\rho_2}(\lambda_2), \dots, I_{\rho_p}(\lambda_p)]$$

reduces simply to the diagonal matrix $[\lambda_1, \dots, \lambda_n]$, and
we have reduction of the matrix to the diagonal form.

It must be pointed out that the matrix U appearing in (109) is not
uniquely defined. In particular, if d is the value of the determinant
of U, we can replace

$$U \text{ by } \frac{1}{\sqrt[n]{d}}\,U \quad\text{and}\quad U^{-1} \text{ by } \sqrt[n]{d}\;U^{-1}$$

in (109), and hence we can take the determinant of U in (109) as
equal to unity. Our treatment of the general problem of reducing
a matrix to the canonical form is limited to these remarks for the
present, though we return to discuss it in a special note at the end
of Part 2 of Vol. III. As already mentioned, a detailed treatment of the
problem for a particular type of matrix will be found below.
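As a small worked illustration (our own example, not one taken from the text), the definition of elementary divisors can be applied to two second order matrices that both have the double characteristic root $\lambda = 2$. The sketch assumes the usual convention that a determinant of order zero is equal to 1.

```latex
% Example 1: A = 2I, n = 2.  D(A - \lambda) = (2 - \lambda)^2, so k = 2.
% First order determinants (the elements of A - \lambda): 2-\lambda, 0, 0, 2-\lambda.
% All are divisible by (\lambda - 2), but not all by (\lambda - 2)^2, so k_1 = 1.
\[
A = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}: \qquad
l_1 = k - k_1 = 1, \quad l_2 = k_1 = 1, \qquad
\text{elementary divisors } (\lambda - 2);\ (\lambda - 2).
\]
% Example 2: A = I_2(2).  Again k = 2, but the element 1 of A - \lambda
% does not vanish for \lambda = 2, so the chain stops at once (m = 0).
\[
A = \begin{pmatrix} 2 & 0 \\ 1 & 2 \end{pmatrix}: \qquad
l_1 = k = 2, \qquad
\text{elementary divisor } (\lambda - 2)^2 .
\]
```

The first matrix is already diagonal. The second coincides with its own canonical form $I_2(2)$ and cannot be reduced to the diagonal form: the only matrix similar to the diagonal matrix [2, 2] is that matrix itself.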

It is easily shown that the necessary and sufficient condition for a
matrix to be reducible to the diagonal form is for the rank of the matrix
of the coefficients of system (105) to be equal to $(n - \mu_k)$ for every
characteristic root $\lambda_k$, where $\mu_k$ is the multiplicity of the
root $\lambda_k$ of the characteristic (secular) equation. When this
condition is satisfied, system (105) defines $\mu_k$ linearly independent
vectors $(v_{1k}, v_{2k}, \dots, v_{nk})$ [14].
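The reduction to the diagonal form in the case of distinct characteristic roots can be followed on a second order example. The sketch below is illustrative only: the matrix A is assumed, it is chosen so that the roots are rational, and the column of V for each root is taken as $(a_{12}, \lambda - a_{11})$, which satisfies the first equation of (105) when $a_{12} \ne 0$. The helper names are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][s] * Q[s][k] for s in range(n)) for k in range(n)]
            for i in range(n)]


def inverse_2x2(M):
    """Inverse of a second order matrix with non-zero determinant."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("D(M) = 0: no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[2, 1], [1, 2]]  # assumed example with a_12 != 0

# Characteristic equation for n = 2: lambda^2 - Tr(A) lambda + D(A) = 0.
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = tr * tr - 4 * det
root = isqrt(disc)
if root * root != disc:
    raise ValueError("this sketch assumes rational characteristic roots")
lambdas = [Fraction(tr - root, 2), Fraction(tr + root, 2)]

# One non-zero solution of system (105) for each root gives a column of V.
columns = [(Fraction(A[0][1]), lam - A[0][0]) for lam in lambdas]
V = [[columns[0][0], columns[1][0]],
     [columns[0][1], columns[1][1]]]

diagonal = matmul(inverse_2x2(V), matmul(A, V))  # V^{-1} A V, as in (101)
```

The resulting matrix is diagonal, with the characteristic roots on the principal diagonal in the order in which the columns of V were chosen.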


28. Unitary and orthogonal transformations. We shall make use in
this and later sections of the concepts of scalar product and vector
norm (length), introduced in [13]. We recall that the square of the
norm (length) is defined by

$$\|x\|^2 = (x, x) = \sum_{k=1}^{n} |x_k|^2, \tag{110}$$

or, in the case of real components:

$$\|x\|^2 = \sum_{k=1}^{n} x_k^2.$$

This definition of norm is bound up with a definite choice of
fundamental vectors, i.e. of coordinate axes. We shall refer to the
coordinate system in the above definition of norm as a normal or Cartesian
system. Apart from vector length, we have defined the scalar product of
two vectors by the formula

$$(x, y) = x_1\bar y_1 + x_2\bar y_2 + \dots + x_n\bar y_n. \tag{111}$$

In the case of real vectors, this expression takes the more symmetrical
form

$$(x, y) = x_1y_1 + x_2y_2 + \dots + x_ny_n.$$

It follows from (111) that the scalar product changes to its conjugate
on changing the order of the vectors:

$$(y, x) = \overline{(x, y)}. \tag{112}$$


We have described two vectors as perpendicular or orthogonal when 
their scalar product is zero. 

In future, unless something is said specifically, we assume that a 
Cartesian coordinate system is in question. In view of this, the linear 
transformations corresponding to passage from one Cartesian system 
to another have a special significance. We know that there is a cor- 
responding linear transformation of components for every passage 
from one fundamental vector set to another. Let us take the
transformation

$$(y_1, \dots, y_n) = U\,(x_1, \dots, x_n), \tag{113}$$




where the original coordinate system is Cartesian. The necessary and 
sufficient condition for the new system to be Cartesian is for the 
length of a vector in the new system to be likewise given by the sum of 
the squares of the moduli of the components, i.e. 


$$|y_1|^2 + \dots + |y_n|^2 = |x_1|^2 + \dots + |x_n|^2. \tag{114}$$


We show that, with this, the value of a scalar product is given in the
new system by an expression analogous to (111). For suppose we have
the two vectors

$$x\,(x_1, \dots, x_n) \quad\text{and}\quad x'\,(x_1', \dots, x_n')$$

in the original coordinate system, corresponding in the new system to

$$y\,(y_1, \dots, y_n) \quad\text{and}\quad y'\,(y_1', \dots, y_n').$$

We form two new vectors: z = x + x' and u = x + ix', with components
$(x_k + x_k')$ and $(x_k + ix_k')$. Since transformation (113) is linear,
these correspond in the new system to y + y' and y + iy'. Assuming
condition (114) fulfilled, we have

$$\sum_{k=1}^{n} (y_k + y_k')(\bar y_k + \bar y_k') = \sum_{k=1}^{n} (x_k + x_k')(\bar x_k + \bar x_k'),$$

whence, again by (114), we get finally:

$$\sum_{k=1}^{n} (y_k\bar y_k' + y_k'\bar y_k) = \sum_{k=1}^{n} (x_k\bar x_k' + x_k'\bar x_k), \tag{115$_1$}$$

since

$$\sum_{k=1}^{n} y_k\bar y_k = \sum_{k=1}^{n} x_k\bar x_k \quad\text{and}\quad \sum_{k=1}^{n} y_k'\bar y_k' = \sum_{k=1}^{n} x_k'\bar x_k'.$$

Similarly:

$$\sum_{k=1}^{n} (y_k + iy_k')(\bar y_k - i\bar y_k') = \sum_{k=1}^{n} (x_k + ix_k')(\bar x_k - i\bar x_k')$$

and consequently:

$$\sum_{k=1}^{n} (y_k\bar y_k' - y_k'\bar y_k) = \sum_{k=1}^{n} (x_k\bar x_k' - x_k'\bar x_k). \tag{115$_2$}$$

Equations (115$_1$) and (115$_2$) give:

$$\sum_{k=1}^{n} y_k\bar y_k' = \sum_{k=1}^{n} x_k\bar x_k', \tag{116}$$

i.e. the scalar product is in fact given by the previous formula. Hence,
if transformation (113) satisfies condition (114), it likewise satisfies
(116), i.e. the value of the scalar product remains invariant.
Conversely, (114) follows from (116), if we put x' = x in (116), since the
scalar product of two identical vectors obviously reduces to the square
of the length of the vector. Linear transformations that satisfy
condition (114) or (116) are generally called unitary.

If we take real space and linear transformations with real matrices,
condition (114) becomes simply

$$y_1^2 + y_2^2 + \dots + y_n^2 = x_1^2 + x_2^2 + \dots + x_n^2 \tag{117}$$

and the corresponding real transformations are called orthogonal.
They are evidently particular cases of unitary transformations.

We shall now elucidate the special properties of unitary
transformations. We write conditions (114) for transformation (113) in the
explicit form, the elements of U being denoted by $u_{ik}$:

$$\sum_{k=1}^{n} y_k\bar y_k = \sum_{k=1}^{n} x_k\bar x_k$$

or

$$\sum_{k=1}^{n} (u_{k1}x_1 + \dots + u_{kn}x_n)(\bar u_{k1}\bar x_1 + \dots + \bar u_{kn}\bar x_n) = \sum_{k=1}^{n} x_k\bar x_k. \tag{118}$$

On removing the brackets on the left-hand side and equating
coefficients of $\bar x_px_p$ to unity, and of $\bar x_px_q$ (p ≠ q) to
zero, we have the necessary and sufficient condition for the elements of a
unitary transformation in the following form:

$$\sum_{k=1}^{n} |u_{kp}|^2 = 1 \quad (p = 1, 2, \dots, n), \qquad
\sum_{k=1}^{n} \bar u_{kp}u_{kq} = 0 \quad (p \ne q), \tag{119}$$

i.e. the sum of the squares of the moduli of the elements of each
column must be equal to unity and the sum of the products of the elements
of one column with the conjugates of the corresponding elements of
another column must be zero. These conditions are sometimes written as

$$\sum_{k=1}^{n} \bar u_{kp}u_{kq} = \delta_{pq}, \tag{120}$$

where $\delta_{pq}$ are the elements of the unit matrix, i.e.

$$\delta_{pq} = \begin{cases} 0 & (p \ne q) \\ 1 & (p = q). \end{cases} \tag{121}$$



We applied above to identity (118) the method of undetermined
coefficients. This is sufficient, of course, for satisfying the identity.
It is easy to show, by assigning particular values to $x_k$, that identity
of the coefficients of like terms is also necessary.

We take determinants $D(U)$ and $D(\bar U)$, the latter consisting of
the conjugates of the elements of the former. On multiplying these column
by column [6], we obtain by (119) the determinant of the unit matrix,
i.e. unity. On the other hand, it is clear that both our determinants are
expressed by complex conjugate numbers, and it follows at once from what
has been said that

$$|D(U)|^2 = 1,$$

i.e. the square of the modulus of the determinant of a unitary matrix is
equal to unity. In other words, the determinant of a unitary matrix
has a modulus of unity, i.e. is expressed by a complex number of the
form $e^{i\varphi}$, where $\varphi$ is real.

We introduce into the discussion $U^{(*)}$, the transpose of matrix U.
Conditions (119), which are generally known as column orthogonality
conditions, can be written in the matrix form

$$\bar U^{(*)}U = I, \tag{122}$$

which is equivalent to

$$U^{-1} = \bar U^{(*)} = \tilde U, \tag{123}$$

i.e. if a matrix is unitary, its inverse is equal to its Hermitian conjugate.

The transformation $U^{-1}$, the inverse of U, expresses the passage
from vectors y to x. This also clearly satisfies unitary condition (114),
i.e. if U is a unitary matrix, its inverse $U^{-1}$ is also unitary. In
other words, by (123), $\tilde U$ is unitary, and its columns satisfy the
orthogonality conditions. But the columns of $\tilde U$ are the rows of
$\bar U$. We can thus say that the rows as well as the columns of a
unitary matrix satisfy orthogonality conditions, i.e. we have in addition
to (120):

$$\sum_{k=1}^{n} u_{pk}\bar u_{qk} = \delta_{pq}. \tag{124}$$

Similarly to the above, if matrices $U_1$ and $U_2$ satisfy condition
(114), their product $U_2U_1$ also clearly satisfies this condition, i.e.
the product of two unitary matrices is also unitary.

We indicate two alternative forms of the definition of unitary
matrix:

$$\|Ux\|^2 = \|x\|^2 \quad\text{or}\quad (Ux, Ux') = (x, x'), \tag{125$_1$}$$

x and x' in the second equation being arbitrary vectors.
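These properties can be checked numerically on a particular unitary matrix. The sketch below is illustrative: the matrix U, the test vectors, and the helper names are assumptions, and floating-point results are compared only up to rounding error.

```python
import math


def scalar_product(x, y):
    """(x, y) = sum of x_k * conj(y_k), following formula (111)."""
    return sum(xk * complex(yk).conjugate() for xk, yk in zip(x, y))


def apply(M, x):
    """Components of the transformed vector M x."""
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]


def column(M, p):
    return [row[p] for row in M]


s = 1 / math.sqrt(2)
U = [[s, s * 1j], [s * 1j, s]]  # assumed example of a unitary matrix

# (120): sum over k of conj(u_kp) u_kq, i.e. (column q, column p).
column_gram = [[scalar_product(column(U, q), column(U, p)) for q in range(2)]
               for p in range(2)]
# (124): sum over k of u_pk conj(u_qk), i.e. (row p, row q).
row_gram = [[scalar_product(U[p], U[q]) for q in range(2)] for p in range(2)]

det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]

# (125_1): the scalar product of two arbitrary vectors is unchanged.
x = [1 + 2j, 3 - 1j]
x_prime = [2j, 1]
before = scalar_product(x, x_prime)
after = scalar_product(apply(U, x), apply(U, x_prime))
```

Both `column_gram` and `row_gram` should agree with the unit matrix, `abs(det_U)` with unity, and `after` with `before`, all to within rounding error.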




We now consider the situation when a unitary matrix has real
elements. As already mentioned, it is described as orthogonal in this
case, the corresponding transformation being an orthogonal
transformation. Here we have, instead of (120) and (124):

$$\sum_{k=1}^{n} u_{kp}u_{kq} = \delta_{pq}; \qquad \sum_{k=1}^{n} u_{pk}u_{qk} = \delta_{pq}. \tag{125$_2$}$$

Moreover the determinant of the transformation must certainly be
a real number, so that its value can only be ±1. These real orthogonal
transformations in n-dimensional space are the complete analogue of
the transformations of three-dimensional space that we discussed in
[20]. In the real case, moreover, $\tilde U$ coincides with $U^{(*)}$,
i.e. the inverse transformation $U^{-1}$ is got from U by replacing rows
by columns.

We mention further that every complex number $e^{i\varphi}$, where
$\varphi$ is real, regarded as the matrix
$[e^{i\varphi}, e^{i\varphi}, \dots, e^{i\varphi}]$, represents a unitary
matrix, and if U is a unitary matrix, the product $e^{i\varphi}U$ is
likewise unitary. We explained in [25] the meaning of the product of a
number with a matrix.


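29. Buniakowski's inequality. In the present section we establish an
inequality which will be useful later. We have already derived this
inequality in Vol. II [II, 156]. It consists in the following: whatever
the real numbers $\alpha_1, \alpha_2, \dots, \alpha_m$ and
$\beta_1, \beta_2, \dots, \beta_m$, we have:

$$\left[\sum_{k=1}^{m} \alpha_k\beta_k\right]^2 \le \sum_{k=1}^{m} \alpha_k^2 \cdot \sum_{k=1}^{m} \beta_k^2, \tag{126}$$

where we have the equals sign when and only when the $\alpha_k$ and
$\beta_k$ are proportional, i.e.

$$\frac{\beta_1}{\alpha_1} = \frac{\beta_2}{\alpha_2} = \dots = \frac{\beta_m}{\alpha_m}. \tag{127}$$

Let $\xi$ be any real number. We form the sum:

$$S = \sum_{k=1}^{m} (\xi\alpha_k - \beta_k)^2,$$

which is clearly ≥ 0. We have the equality when and only when

$$\frac{\beta_1}{\alpha_1} = \frac{\beta_2}{\alpha_2} = \dots = \frac{\beta_m}{\alpha_m} = \xi.$$

Generally speaking, on removing the brackets in S, we get the quadratic
expression

$$S = A\xi^2 - 2B\xi + C,$$

where

$$A = \sum_{k=1}^{m} \alpha_k^2; \qquad B = \sum_{k=1}^{m} \alpha_k\beta_k; \qquad C = \sum_{k=1}^{m} \beta_k^2.$$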

The quadratic expression remains ≥ 0 for all real $\xi$, whence it
follows that $AC - B^2 \ge 0$, i.e. $B^2 \le AC$, which leads to
inequality (126).

If $AC - B^2 = 0$, the quadratic expression must vanish for some real
$\xi$, whilst condition (127) must then be fulfilled, as we saw.
Conversely, if the condition is satisfied, we have the = sign in (126).
Now let the $\alpha_k$ and $\beta_k$ be complex numbers. Using the fact
that the modulus of a sum is ≤ the sum of the moduli of the terms, we get

$$\left|\sum_{k=1}^{m} \alpha_k\beta_k\right| \le \sum_{k=1}^{m} |\alpha_k|\,|\beta_k|.$$

On applying inequality (126) to the last sum, consisting of
non-negative terms, we get

$$\left|\sum_{k=1}^{m} \alpha_k\beta_k\right|^2 \le \sum_{k=1}^{m} |\alpha_k|^2 \cdot \sum_{k=1}^{m} |\beta_k|^2. \tag{126$_1$}$$


It is easily shown that in the present case, with complex $\alpha_k$ and
$\beta_k$, the sign of equality occurs when and only when $|\alpha_k|$ and
$|\beta_k|$ are proportional, and all the products $\alpha_k\beta_k$ have
the same argument. Inequality (126) is applicable to integrals as well as
sums, as already mentioned in [II, 156]. If $f_1(x)$ and $f_2(x)$ are two
real functions in the interval a ≤ x ≤ b, the inequality for integrals is

$$\left[\int_a^b f_1(x)f_2(x)\,dx\right]^2 \le \int_a^b f_1^2(x)\,dx \cdot \int_a^b f_2^2(x)\,dx. \tag{126$_2$}$$

We see this by forming the expression

$$\int_a^b [\xi f_1(x) - f_2(x)]^2\,dx = \xi^2\int_a^b f_1^2(x)\,dx - 2\xi\int_a^b f_1(x)f_2(x)\,dx + \int_a^b f_2^2(x)\,dx,$$

where $\xi$ is any real number. The form of the left-hand side implies
that this cannot be negative for any real $\xi$. But if an expression of
the form $A\xi^2 - 2\xi B + C$ is non-negative for all real $\xi$, we know
from elementary algebra that $AC - B^2 \ge 0$. On applying this to the
right-hand side above we get (126$_2$). The inequality was first proved
for integrals by V. L. Buniakowski. It was encountered by Cauchy for
sums.
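Inequality (126) and its case of equality can be tried on small integer sequences. This is a minimal sketch; the sequences and the function name `both_sides` are illustrative assumptions.

```python
def both_sides(alpha, beta):
    """Return (B^2, A*C) with A, B, C as defined for the quadratic expression S."""
    A = sum(a * a for a in alpha)
    B = sum(a * b for a, b in zip(alpha, beta))
    C = sum(b * b for b in beta)
    return B * B, A * C


# Sequences that are not proportional: strict inequality is expected.
generic = both_sides([1, 2, 3], [4, -1, 2])
# beta_k = 2 * alpha_k: condition (127) holds, so equality is expected.
proportional = both_sides([1, 2, 3], [2, 4, 6])
```

In the first case the left-hand side of (126) is strictly smaller than the right-hand side; in the second the two sides coincide, in agreement with condition (127).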


30. Properties of scalar products and norms. We now mention
some properties of scalar products and norms. On applying (126$_1$)
and taking into account that $|\bar y_k| = |y_k|$, we can write:

$$|(x, y)|^2 = \left|\sum_{k=1}^{n} x_k\bar y_k\right|^2 \le \sum_{k=1}^{n} |x_k|^2 \cdot \sum_{k=1}^{n} |y_k|^2,$$

i.e.

$$|(x, y)| \le \|x\| \cdot \|y\|. \tag{128}$$

We now prove the so-called triangle rule:

$$\|x + y\| \le \|x\| + \|y\|. \tag{129}$$

We have:

$$\|x + y\|^2 = (x + y, x + y) = (x, x) + (y, y) + (x, y) + (y, x),$$

so that we get by taking into account (128):

$$\|x + y\|^2 \le \|x\|^2 + \|y\|^2 + 2\,\|x\| \cdot \|y\| = (\|x\| + \|y\|)^2,$$

whence (129) follows.
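The step from the expansion of $\|x + y\|^2$ to the estimate by (128) uses the fact that the two mixed terms are conjugate to each other by (112). Written out in full, under the definitions above, it reads:

```latex
\[
(x, y) + (y, x) = (x, y) + \overline{(x, y)} = 2\,\operatorname{Re}\,(x, y)
\le 2\,|(x, y)| \le 2\,\|x\| \cdot \|y\| .
\]
```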

We conclude the present section by considering the effect of the
choice of coordinate system on the metric of a space, i.e. on the
expression for the square of the length of a vector. Suppose we choose
a new system in place of the fundamental Cartesian system, with
the fundamental set consisting of the independent vectors

$$z^{(1)}, z^{(2)}, \dots, z^{(n)}.$$

We shall have for any vector:

$$x = z_1z^{(1)} + \dots + z_nz^{(n)},$$

where the $z_k$ are its components in the new coordinate system.
The square of the length of this vector will be given by its scalar
product with itself, i.e.

$$\|x\|^2 = (z_1z^{(1)} + \dots + z_nz^{(n)},\; z_1z^{(1)} + \dots + z_nz^{(n)}).$$

On expanding in accordance with the formula previously given, we
get the following expression for the square of the length of the vector:

$$\|x\|^2 = \sum_{i,k=1}^{n} a_{ik}z_i\bar z_k, \tag{130}$$




where the coefficients $a_{ik}$ are given by

$$a_{ik} = (z^{(i)}, z^{(k)}).$$

These latter evidently take their conjugate values on interchanging
the subscripts, i.e.

$$a_{ki} = \bar a_{ik}. \tag{131}$$

A sum of the form (130), with coefficients satisfying condition (131),
is generally referred to as an Hermitian form. It follows at once that
every expression of type (130) with condition (131) has real values
only, whatever the complex $z_k$, since with i ≠ k a pair of terms of
(130) will be conjugates, whilst in the case of terms of the form
$a_{kk}|z_k|^2$, the $a_{kk}$ are real by (131). Furthermore, we can say
from the method of obtaining Hermitian form (130) that it is not negative
and only vanishes when all the $z_k$ vanish. Formula (130) in fact defines
the metric of the space in the new coordinate system.

Metric (130) will coincide with metric (110) in the corresponding
Cartesian system if

$$a_{ik} = 0 \ \text{ for } i \ne k \quad\text{and}\quad a_{kk} = 1,$$

i.e. if

$$(z^{(i)}, z^{(k)}) = 0 \ \text{ for } i \ne k \quad\text{and}\quad (z^{(k)}, z^{(k)}) = 1,$$

or in other words, if the $z^{(k)}$ that we have taken as the fundamental
set are mutually orthogonal unit vectors (of unit length).
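The metric (130) can be compared with the Cartesian norm (110) on a small example. The sketch below is illustrative: the non-orthogonal fundamental set, the components, and the helper name `scalar_product` are assumptions chosen so that the arithmetic is exact.

```python
def scalar_product(x, y):
    """(x, y) = sum of x_k * conj(y_k), following formula (111)."""
    return sum(xk * complex(yk).conjugate() for xk, yk in zip(x, y))


# Assumed fundamental set z^(1), z^(2): independent but not orthogonal.
basis = [[1, 0], [1j, 1]]
# Assumed components z_1, z_2 of a vector in the new coordinate system.
components = [2, 1j]

# The same vector in the original Cartesian system: x = z_1 z^(1) + z_2 z^(2).
x = [sum(components[i] * basis[i][j] for i in range(2)) for j in range(2)]

# Coefficients a_ik = (z^(i), z^(k)) of the Hermitian form.
a = [[scalar_product(basis[i], basis[k]) for k in range(2)] for i in range(2)]

# Right-hand side of (130).
hermitian_form = sum(a[i][k] * components[i] * complex(components[k]).conjugate()
                     for i in range(2) for k in range(2))
# Square of the length computed directly by (110).
cartesian_norm_sq = scalar_product(x, x)
```

The two values agree, and the coefficients satisfy condition (131), here $a_{12} = -i$ and $a_{21} = i$.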

A further point is that, if (113) defines a unitary transformation
of vector components, the corresponding transformation for passing
from the original to the new fundamental set is given by the matrix

$$(U^{(*)})^{-1},$$

the contragredient of U. This array coincides with $\bar U$ in the
present case by (123), whilst it coincides simply with U in the case of
real orthogonal transformations.


31. Orthogonalization of vectors. Suppose we are given any m
linearly independent vectors $x^{(1)}, \dots, x^{(m)}$. The set of vectors

$$C_1x^{(1)} + \dots + C_mx^{(m)},$$

where the $C_k$ are arbitrary constants, defines our total space if m = n
or an m-dimensional subspace $R_m$ if m < n. We show that m mutually
orthogonal unit vectors can always be constructed so as to form the
same subspace $R_m$ as the vectors $x^{(k)}$. In other words, these new
orthogonal unit vectors $z^{(k)}$ must be expressible linearly in terms of
the $x^{(k)}$, and conversely, the $x^{(k)}$ must be expressible in terms
of the $z^{(k)}$. We can construct these vectors in accordance with the
following scheme:

$$\begin{aligned}
y^{(1)} &= x^{(1)} \\
y^{(2)} &= x^{(2)} - (x^{(2)}, z^{(1)})\,z^{(1)} \\
y^{(3)} &= x^{(3)} - (x^{(3)}, z^{(1)})\,z^{(1)} - (x^{(3)}, z^{(2)})\,z^{(2)} \\
&\dots\dots\dots
\end{aligned} \tag{132}$$

where

$$z^{(1)} = \frac{y^{(1)}}{\|y^{(1)}\|}; \quad z^{(2)} = \frac{y^{(2)}}{\|y^{(2)}\|}; \quad \dots; \quad z^{(m)} = \frac{y^{(m)}}{\|y^{(m)}\|}. \tag{133}$$


The vector $z^{(1)}$ is found from $y^{(1)}$ simply by dividing by the
length of $y^{(1)}$, so that the length of $z^{(1)}$ is unity. Next
$y^{(2)}$ is constructed in accordance with the above formula, and its
definition implies at once that it is orthogonal to $z^{(1)}$:

$$(y^{(2)}, z^{(1)}) = (x^{(2)}, z^{(1)}) - (x^{(2)}, z^{(1)})\,(z^{(1)}, z^{(1)}) = 0.$$

On dividing $y^{(2)}$ by its own length, we get $z^{(2)}$. Next we
construct $y^{(3)}$ in accordance with (132), and it follows directly from
the definition that it is orthogonal to $z^{(1)}$ and $z^{(2)}$.
For we have by the orthogonality of $z^{(1)}$ and $z^{(2)}$:

$$(y^{(3)}, z^{(2)}) = (x^{(3)}, z^{(2)}) - (x^{(3)}, z^{(2)})\,(z^{(2)}, z^{(2)}) = 0,$$

and similarly for $(y^{(3)}, z^{(1)})$. Division of $y^{(3)}$ by its own
length gives us $z^{(3)}$, and so on.
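Scheme (132) translates directly into a short routine. The following is a minimal sketch, not a numerically robust implementation; the function names, the tolerance used to detect linear dependence, and the sample vectors are all assumptions introduced here.

```python
import math


def scalar_product(x, y):
    """(x, y) = sum of x_k * conj(y_k), following formula (111)."""
    return sum(xk * complex(yk).conjugate() for xk, yk in zip(x, y))


def norm(x):
    """Length of a vector, the square root of (x, x)."""
    return math.sqrt(scalar_product(x, x).real)


def orthonormalize(xs, tolerance=1e-12):
    """Build mutually orthogonal unit vectors z^(1), ..., z^(m) by scheme (132)."""
    zs = []
    for x in xs:
        # y^(k) = x^(k) - sum over earlier s of (x^(k), z^(s)) z^(s)
        y = list(x)
        for z in zs:
            c = scalar_product(x, z)
            y = [yk - c * zk for yk, zk in zip(y, z)]
        length = norm(y)
        if length < tolerance:
            raise ValueError("the given vectors are linearly dependent")
        zs.append([yk / length for yk in y])
    return zs


# Three linearly independent vectors of three-dimensional space (assumed data).
xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
zs = orthonormalize(xs)
```

The scalar products $(z^{(i)}, z^{(k)})$ of the resulting vectors should be 1 for i = k and 0 otherwise, up to rounding error. A zero $y^{(k)}$ is reported as an error, since it can arise only when the given vectors are linearly dependent.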


All the newly constructed vectors are given linearly in terms of the 
x. It is easily seen that conversely the x“ are expressible linearly 
in terms of the 2“. We can do this simply by solving successively 
the above equations with respect to x x?) and so on. 

We notice also that none of the new y“ can be zero. For if we 
obtained some zero y“ at some step in the working, since this is 
given in terms of the x by a linear expression in which the coefficient 
of x™ is unity, we should now have linear dependence of certain of 
the x, which contradicts our hypothesis that these vectors are 
linearly independent. 

We recall that if pairs of a non-zero vector set are orthogonal, 
the vectors must be linearly independent. 

If $m = n$, the $z^{(k)}$ yield a complete system of orthogonal unit vectors, forming a Cartesian system. But if $m < n$, a complete Cartesian system requires the addition to the $z^{(k)}$ of a further $(n - m)$ vectors, these latter being orthogonal both to each other and to the $z^{(k)}$. These new unit vectors must thus form a subspace $R'_{n-m}$ of $(n - m)$ dimensions, orthogonal to the subspace $R_m$ [12]. The new required vectors $u$ must satisfy the system of equations

$$(u, x^{(1)}) = 0, \ldots, (u, x^{(m)}) = 0.$$


We have here a system of $m$ homogeneous equations with $n$ unknowns, the rank of the system being $m$, since the vectors $x^{(k)}$ are linearly independent [12]. The system has $(n - m)$ linearly independent solutions, i.e. we get $(n - m)$ linearly independent vectors. On applying the above process of orthogonalization to these and reducing their lengths to unity, we obtain our complete set of linearly independent unit vectors.

We notice one further point. The subspace $R_m$ formed by the orthogonal unit vectors $z^{(k)}$ can equally be formed by another system of orthogonal unit vectors. We see this simply by applying a unitary transformation to the system of $z^{(k)}$. Thus the process of orthogonalization can be carried out in different ways, and the method indicated above is only one of the possibilities.


32. Transformation of a quadratic form to a sum of squares. We take a second order surface in space with its centre at the coordinate origin:

$$Ax^2 + By^2 + Cz^2 + 2Dxy + 2Exz + 2Fyz + G = 0.$$

New axes $(x', y', z')$ can always be chosen such that only terms containing squares of the coordinates remain in the transformed equation, i.e. such that the transformed equation has the form

$$\lambda_1 x'^2 + \lambda_2 y'^2 + \lambda_3 z'^2 + G = 0.$$

The problem amounts to finding the orthogonal transformation relating variables $(x', y', z')$ and $(x, y, z)$, so that the set of second degree coordinate terms on the left-hand side of the equation reduces to a sum of squares. We formulate the analogous problem for real $n$-dimensional space. Let us have the real quadratic form in $n$ variables:

$$\varphi(x_1, \ldots, x_n) = \sum_{i,k=1}^{n} a_{ik} x_i x_k, \tag{134}$$

where the $a_{ik}$ are real coefficients satisfying the condition

$$a_{ki} = a_{ik}. \tag{135}$$

We can take in the previous example $x = x_1$, $y = x_2$, $z = x_3$, and

$$a_{11} = A; \quad a_{22} = B; \quad a_{33} = C; \quad a_{12} = a_{21} = D; \quad a_{13} = a_{31} = E; \quad a_{23} = a_{32} = F.$$


The matrix $A$ composed of the elements $a_{ik}$ is known as the matrix of the quadratic form (134). This is a symmetric matrix, i.e. the same as its transpose.

Suppose we transform (134) to new variables $x_i'$ instead of $x_i$, the transformation being written as

$$(x_1, \ldots, x_n) = B\,(x_1', \ldots, x_n'), \tag{136}$$

where $B$ is a matrix with elements $b_{ik}$. On substituting in (134) from (136), we get

$$\varphi = \sum_{i,k=1}^{n} a_{ik}\,(b_{i1}x_1' + \ldots + b_{in}x_n')\,(b_{k1}x_1' + \ldots + b_{kn}x_n').$$

Removal of the brackets gives us for the coefficient of $x_p' x_q'$ with $p \ne q$:

$$\sum_{i,k=1}^{n} a_{ik}\,(b_{ip}b_{kq} + b_{iq}b_{kp}).$$

It is easily seen by using (135) that half the last expression is simply equal to the sum:

$$\sum_{i=1}^{n} b_{ip} \sum_{k=1}^{n} a_{ik} b_{kq}.$$

Hence, on dealing similarly with the case $p = q$, we see that the quadratic form becomes in the new variables:

$$\varphi = \sum_{i,k=1}^{n} c_{ik}\, x_i' x_k', \tag{137}$$

where

$$c_{ik} = c_{ki} = \sum_{t=1}^{n} \sum_{s=1}^{n} b_{ti}\, a_{ts}\, b_{sk}.$$

Summation over $s$ gives $\{AB\}_{tk}$. If we take $i$ as indicating the column and $t$ as indicating the row in the factor $b_{ti}$, $b_{ti}$ will be the element $\{B^{(*)}\}_{it}$ of the transpose, whence

$$c_{ik} = c_{ki} = \sum_{t=1}^{n} \{B^{(*)}\}_{it}\,\{AB\}_{tk},$$

i.e. the transformed form (137) has a matrix given in terms of the matrix $A$ of the form in the original variables and the matrix $B$ of transformation (136) by

$$C = B^{(*)} A B. \tag{138}$$


If transformation (136) is orthogonal, the transpose $B^{(*)}$ of the orthogonal matrix $B$ is the same as the inverse $B^{-1}$, and we have in this case, instead of (138):

$$C = B^{-1} A B. \tag{139}$$
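The rule that the matrix of the transformed form is the product of the transpose of $B$, the matrix $A$ and $B$ can be checked numerically. The following Python sketch uses small integer matrices chosen purely for illustration; the helper names `transpose`, `matmul` and `quadratic_form` are assumptions of this sketch.

```python
def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][s] * q[s][k] for s in range(len(q)))
             for k in range(len(q[0]))] for i in range(len(p))]

def quadratic_form(a, x):
    """Value of the form with matrix a at the point x."""
    n = len(x)
    return sum(a[i][k] * x[i] * x[k] for i in range(n) for k in range(n))

A = [[1, 1, 3], [1, 5, 1], [3, 1, 1]]    # symmetric matrix of the form
B = [[1, 2, 0], [0, 1, -1], [2, 0, 1]]   # an arbitrary non-singular transformation
C = matmul(transpose(B), matmul(A, B))   # matrix of the form in the new variables

x_new = [2, -1, 3]
# old variables expressed through the new ones: x = B x'
x_old = [sum(B[i][p] * x_new[p] for p in range(3)) for i in range(3)]
same_value = quadratic_form(A, x_old) == quadratic_form(C, x_new)
```

Here `B` is deliberately not orthogonal, so the check concerns the general rule; for an orthogonal `B` the transpose could be replaced by the inverse.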


Our task of finding an orthogonal transformation (136) reducing the quadratic form (134) to a sum of squares is thus equivalent to the task of finding an orthogonal matrix $B$ such that matrix (139) is simply a diagonal matrix $[\lambda_1, \ldots, \lambda_n]$, inasmuch as, if a form reduces to a sum of squares, its matrix is in fact a diagonal matrix whose elements $\lambda_k$ are the coefficients of the $x_k'^2$. We must therefore have, as previously:

$$B^{-1} A B = [\lambda_1, \ldots, \lambda_n]$$

or

$$A B = B\,[\lambda_1, \ldots, \lambda_n]. \tag{140}$$

We remark that the $A$ here is real and symmetric, and not an arbitrary matrix, whilst $B$ must be orthogonal. We shall proceed precisely as in [27], when considering the general case. We re-write (140) as

$$\sum_{s=1}^{n} a_{is} b_{sk} = \lambda_k b_{ik}. \tag{141}$$


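We thus have $n$ equations for the elements of the $k$th column of $B$. On introducing the vector $x^{(k)}$ with components $(b_{1k}, \ldots, b_{nk})$, we can write the last equation as

$$A x^{(k)} = \lambda_k x^{(k)}. \tag{142}$$

On taking all the terms of (141) to one side, we have a system of $n$ homogeneous equations for $b_{1k}, \ldots, b_{nk}$:

$$
\begin{aligned}
(a_{11} - \lambda_k)\, b_{1k} + a_{12} b_{2k} + \ldots + a_{1n} b_{nk} &= 0 \\
a_{21} b_{1k} + (a_{22} - \lambda_k)\, b_{2k} + \ldots + a_{2n} b_{nk} &= 0 \\
\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots\ldots & \\
a_{n1} b_{1k} + a_{n2} b_{2k} + \ldots + (a_{nn} - \lambda_k)\, b_{nk} &= 0
\end{aligned}
\tag{143}
$$

The determinant of this system must vanish, and we get an algebraic equation of degree $n$ for the numbers $\lambda_k$:

$$
\begin{vmatrix}
a_{11} - \lambda & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} - \lambda & \ldots & a_{2n} \\
\ldots & \ldots & \ldots & \ldots \\
a_{n1} & a_{n2} & \ldots & a_{nn} - \lambda
\end{vmatrix} = 0. \tag{144}
$$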


As we know, this is the characteristic equation of matrix $A$.

We show first of all that all the roots of equation (144) are real for a real symmetric matrix $A$. We start by indicating a new way of writing a quadratic form. Let $x$ be a vector with real or complex components $(x_1, \ldots, x_n)$ and $A$ a matrix with any elements $a_{ik}$. We form the scalar product:

$$(Ax, x) = \sum_{i=1}^{n} \bar{x}_i\,(a_{i1}x_1 + \ldots + a_{in}x_n).$$

It can evidently be written in the form:

$$(Ax, x) = \sum_{i,k=1}^{n} a_{ik}\,\bar{x}_i x_k. \tag{145}$$

If the condition

$$a_{ki} = \bar{a}_{ik} \quad (a_{ii}\ \text{real}) \tag{146}$$

is satisfied, we have an Hermitian form, the value of which is necessarily real. The case of real symmetric $A$ is a particular case of condition (146). If in addition the components of $x$ are real, (145) in fact yields the quadratic form (134).

We now turn to the proof that the roots of (144) are real. Let $\lambda_k$ be a root of the equation. System (143) now gives us the components of a non-zero vector $x^{(k)}$ (real or complex) which satisfies equation (142). We take the scalar product on the right with $x^{(k)}$ of both sides of this equation, and get:

$$\lambda_k\,|x^{(k)}|^2 = (A x^{(k)}, x^{(k)}).$$

The expression on the right is a real number, as we have seen, whilst $|x^{(k)}|^2$ is real and positive, and consequently $\lambda_k$ is also a real number. We have thus proved that the roots of (144) are real not only for a real symmetric matrix but also for any matrix whose elements satisfy condition (146). These latter matrices are generally described as Hermitian.

The coefficients of system (143) are real numbers in the present case, and we can take the components of $x^{(k)}$ as real. We now show that if $\lambda_p$ and $\lambda_q$ are two different roots of (144), the corresponding $x^{(p)}$ and $x^{(q)}$ satisfying equation (142) are mutually orthogonal. We have by hypothesis:

$$A x^{(p)} = \lambda_p x^{(p)}; \quad A x^{(q)} = \lambda_q x^{(q)}.$$

On forming the scalar product of both sides of the first of these equations with $x^{(q)}$, and of $x^{(p)}$ with both sides of the second, and subtracting, we get:

$$(A x^{(p)}, x^{(q)}) - (x^{(p)}, A x^{(q)}) = (\lambda_p - \lambda_q)\,(x^{(p)}, x^{(q)}). \tag{147}$$


We now show that, for any two (real or complex) vectors $x$ and $y$, we have

$$(Ax, y) = (x, Ay), \tag{148}$$

provided only that the elements of matrix $A$ satisfy condition (146). The left-hand side of (148) gives, in fact:

$$(Ax, y) = \sum_{k=1}^{n} (a_{k1}x_1 + \ldots + a_{kn}x_n)\,\bar{y}_k = \sum_{i,k=1}^{n} a_{ki}\,x_i\,\bar{y}_k,$$

or, by (146):

$$(Ax, y) = \sum_{i,k=1}^{n} \bar{a}_{ik}\,x_i\,\bar{y}_k.$$

The right-hand side of (148) yields the same result. Our formula must likewise be valid for real symmetric matrices, since these are a particular case of Hermitian matrices. By (148), the left-hand side of (147) is zero, and since $\lambda_p \ne \lambda_q$ we have $(x^{(p)}, x^{(q)}) = 0$, i.e. vectors $x^{(p)}$ and $x^{(q)}$ are in fact orthogonal. These vectors are real in the present case, and their orthogonality amounts to the condition that the sum of the products of their components is zero.

Thus if (144) has $n$ different roots, we have $n$ mutually orthogonal real vectors $x^{(k)}$. Equation (142) is linear and homogeneous in $x^{(k)}$, and we can therefore multiply a solution of it by an arbitrary constant. It follows that we can take the lengths of the $x^{(k)}$ as unity.

The components of these vectors form the columns of matrix $B$. In other words, $B$ satisfies the condition for orthogonality with respect to columns, and is an orthogonal matrix. Hence our task of reducing a quadratic form to a sum of squares by means of an orthogonal transformation (or, what amounts to the same thing, of reducing a matrix $A$ to a diagonal form) has been solved, provided we assume that equation (144) has $n$ different roots. The numbers $\lambda_k$ are known as the eigenvalues of matrix $A$, whilst the $x^{(k)}$ are the eigenvectors of the matrix.


33. The case of multiple roots of the characteristic equation. We now take the general case, when equation (144) can have multiple roots. We find the solution of (142) corresponding to a given root $\lambda = \lambda_1$ of (144). The solution will be a real vector $x^{(1)}$ of unit length. We associate with it a further $(n - 1)$ real unit vectors, so that altogether a complete system of orthogonal unit vectors is formed [31]. The passage from the old to the new fundamental set will be expressed as usual by an orthogonal transformation of the vector components, and $A$ becomes the similar matrix $A_1 = B_1^{-1} A B_1$. The equation

$$A_1 x = \lambda x, \tag{149}$$

corresponding to the new matrix, will have $x^{(1)}$ as the solution corresponding to the eigenvalue $\lambda = \lambda_1$ (the eigenvalues are invariants in a similarity transformation), where $x^{(1)}$ is the vector we took as the first of the fundamental set, so that its components are $(1, 0, \ldots, 0)$. On substituting this solution in (149), we get:

$$A_1 (1, 0, \ldots, 0) = (\lambda_1, 0, \ldots, 0),$$

so that we have at once for the elements of the first column:

$$\{A_1\}_{11} = \lambda_1; \quad \{A_1\}_{21} = \{A_1\}_{31} = \ldots = \{A_1\}_{n1} = 0. \tag{150}$$
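We now show that the real matrix $A_1$ is also symmetric, i.e. is the same as its transpose. In fact:

$$A_1^{(*)} = (B_1^{-1} A B_1)^{(*)} = B_1^{(*)} A^{(*)} (B_1^{-1})^{(*)}.$$

But since $B_1$ is orthogonal:

$$B_1^{(*)} = B_1^{-1} \quad \text{and} \quad (B_1^{-1})^{(*)} = B_1,$$

whence it follows, since $A^{(*)} = A$, that

$$A_1^{(*)} = A_1.$$

On taking (150) into account, together with the symmetry of $A_1$, we can write:

$$\{A_1\}_{11} = \lambda_1; \quad \{A_1\}_{21} = \ldots = \{A_1\}_{n1} = \{A_1\}_{12} = \ldots = \{A_1\}_{1n} = 0,$$

i.e. all the elements of the first row and first column of $A_1$ vanish with the exception, of course, of $\{A_1\}_{11} = \lambda_1$, i.e. $A_1$ has the form

$$
A_1 = \begin{Vmatrix}
\lambda_1 & 0 & \ldots & 0 \\
0 & a_{22}^{(1)} & \ldots & a_{2n}^{(1)} \\
\ldots & \ldots & \ldots & \ldots \\
0 & a_{n2}^{(1)} & \ldots & a_{nn}^{(1)}
\end{Vmatrix},
$$

where we have written $a_{ik}^{(1)}$ for the elements of $A_1$.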
The quadratic form $\varphi$ becomes in the new variables:

$$\varphi = \lambda_1 y_1^2 + \sum_{i,k=2}^{n} a_{ik}^{(1)}\, y_i y_k.$$

We have thus isolated one square and now have to consider the quadratic form in $(n - 1)$ variables

$$\sum_{i,k=2}^{n} a_{ik}^{(1)}\, y_i y_k,$$

or what is the same thing, we have to consider the $(n - 1)$th order matrix $C_1$ corresponding to this and forming part of $A_1$. We can argue here exactly as above and find a unit vector $x^{(2)}$ in the $(n - 1)$-dimensional subspace formed by the remaining $(n - 1)$ fundamental vectors such that $x^{(2)}$ is a solution of the equation

$$C_1 x^{(2)} = \lambda_2 x^{(2)}.$$

This vector is clearly orthogonal to $x^{(1)}$. The fundamental vector $x^{(1)}$ is preserved after this second transformation, whilst the remaining fundamental vectors become a new mutually orthogonal unit set, the first of which is $x^{(2)}$. The quadratic form $\varphi$ becomes in the new variables

$$\varphi = \lambda_1 y_1'^2 + \lambda_2 y_2'^2 + \sum_{i,k=3}^{n} a_{ik}^{(2)}\, y_i' y_k'.$$

By proceeding thus, we finally reduce the quadratic form to a sum of squares, i.e. we reduce the corresponding matrix to a diagonal form. We do this as a result of applying a series of orthogonal transformations, and this is clearly equivalent to the application of the single orthogonal transformation $B$ equal to their product.

The final diagonal matrix

$$B^{-1} A B = [\lambda_1, \ldots, \lambda_n] \tag{151}$$

is similar to the original matrix $A$, and consequently its characteristic equation

$$(\lambda_1 - \lambda)(\lambda_2 - \lambda) \ldots (\lambda_n - \lambda) = 0$$

is the same as (144); in other words, the coefficients $\lambda_k$ of the squares in the reduced quadratic form

$$\varphi = \lambda_1 x_1'^2 + \ldots + \lambda_n x_n'^2 \tag{152}$$

are the roots of (144), each multiple root being repeated as many times as its multiplicity.

As we know, each column of the final orthogonal transformation $B$ yields a vector which is a solution of (142), where the method of derivation implies that the $\lambda_k$ corresponding to a given column is the same as the coefficient of the corresponding variable in quadratic form (152). The situation may be indicated more precisely. By (136), the orthogonal transformation $B$, which satisfies condition (140), transforms the variables $(x_1', \ldots, x_n')$ to $(x_1, \ldots, x_n)$.

The inverse $B^{-1}$ is the transpose of $B$, i.e. we have

$$x_k' = b_{1k}x_1 + \ldots + b_{nk}x_n \quad (k = 1, 2, \ldots, n), \tag{153}$$

and $x^{(k)}$, with components $(b_{1k}, \ldots, b_{nk})$, is a solution of (142) with $\lambda = \lambda_k$.

We finally show that we have found all the solutions of (142). First of all, it follows from the above arguments that $\lambda_k$ must be a solution of (144). Let $\lambda$ be any root of this equation; we suppose for clarity that it has multiplicity three, whilst it is obviously possible to take $\lambda = \lambda_1 = \lambda_2 = \lambda_3$. The equation

$$A x = \lambda_1 x \tag{154}$$

now has, by the previous discussion, the three solutions:

$$x^{(1)}\,(b_{11}, \ldots, b_{n1}); \quad x^{(2)}\,(b_{12}, \ldots, b_{n2}); \quad x^{(3)}\,(b_{13}, \ldots, b_{n3}).$$

We show that every solution of (154) must be a linear combination of these. If this were not so, we should in fact have a solution $y$ linearly independent of $x^{(1)}$, $x^{(2)}$, $x^{(3)}$. Our $y$ could be complex, but in this case its real and imaginary parts have to satisfy (154) individually, since the coefficients of the equation are real. At least one of these parts must obviously be a vector linearly independent of the $x^{(k)}$ $(k = 1, 2, 3)$. Hence we can suppose that our $y$ is real. As shown above, it must be orthogonal to all the $x^{(k)}$ with $k > 3$, since the $\lambda_k$ corresponding to these are different from $\lambda_1$. The result is that $y$ is linearly independent of the total set of $x^{(k)}$, i.e. we have $(n + 1)$ linearly independent vectors, which is impossible. Every $m$-tuple root of (144) thus implies precisely $m$ linearly independent solutions of (154).

Substitution of an $m$-tuple root $\lambda = \lambda_1$ for $\lambda_k$ in the coefficients of system (143) gives us a homogeneous system with $m$ linearly independent solutions, i.e. the rank of the system must be $(n - m)$. In other words, the system reduces to $(n - m)$ equations. We take any solution of the system and multiply it by a factor such that the sum of the squares of the numbers appearing in the solution is unity. This gives us one vector corresponding to the root $\lambda = \lambda_1$. To find the next vector, we add to the $(n - m)$ equations of our system a further equation expressing the orthogonality of the required vector to that already obtained. We thus have a homogeneous system of $(n - m + 1)$ equations for the components of the new vector. On taking any solution of this system and normalizing again (reducing the vector length to unity), we are faced with the task of finding a third vector corresponding to the root $\lambda = \lambda_1$. We do this by adding to the original $(n - m)$ equations a further two equations, expressing the orthogonality of the new required vector to the two already found, and so on until we arrive at the total set of $m$ mutually orthogonal unit vectors corresponding to the $m$-tuple root $\lambda = \lambda_1$. A direct consequence of this method of construction is a certain arbitrariness in constructing the basic solutions of equations (142). If all the roots of the equation are simple, this arbitrariness merely amounts to the possibility of multiplying all the components of $x^{(k)}$ by $(-1)$. Now let (144) have an $m$-tuple root. In this case the corresponding $m$ orthogonal unit vectors making up the solution of (142) form an $m$-dimensional subspace $R_m$. We can obviously make an arbitrary choice of mutually orthogonal fundamental vectors in this subspace, and they will likewise be solutions of (142) with $\lambda = \lambda_1$, i.e. we can pass from one set of orthogonal normalized solutions to another by carrying out an orthogonal transformation of $R_m$. All these remarks equally apply to any other multiple root of (144).

What has been said may be explained by returning to the problem 
treated at the start of the previous section, of reducing the equation 
of a second order surface to axes of symmetry. Suppose for definiteness 
that the surface is an ellipsoid. The case of different roots of (144) 
corresponds to the fact that all the semi-axes of the ellipsoid are 
different. In this case the natural arbitrariness in the choice of final 
coordinate axes amounts to a change in the direction of these axes. 
If (144), which is here an equation of the third degree, has two equal 
roots, the ellipsoid becomes an ellipsoid of revolution, and two axes 
of symmetry can lie where we please in the plane passing through 
the centre and perpendicular to the axis of revolution, provided 
only that they are perpendicular to each other, i.e. in the present 
case the arbitrariness in the choice of final axes consists further in 
an arbitrary orthogonal transformation in the above-mentioned 
plane. Finally, if all three roots of (144) are equal, our ellipsoid is a 
sphere, and our equation does not contain coordinate product terms. 
Here our choice of Cartesian axes in space is in general completely 
arbitrary. 


34. Examples. We shall consider two numerical examples.

1. To reduce the surface given by

$$x_1^2 + 5x_2^2 + x_3^2 + 2x_1x_2 + 6x_1x_3 + 2x_2x_3 = 5$$

to axes of symmetry.

The corresponding quadratic form will be

$$
\begin{aligned}
\varphi = {} & x_1^2 + x_1x_2 + 3x_1x_3 \\
& + x_2x_1 + 5x_2^2 + x_2x_3 \\
& + 3x_3x_1 + x_3x_2 + x_3^2.
\end{aligned}
$$

The characteristic equation of its matrix is

$$
\begin{vmatrix}
1 - \lambda & 1 & 3 \\
1 & 5 - \lambda & 1 \\
3 & 1 & 1 - \lambda
\end{vmatrix} = 0,
$$

whence, on expanding by the first row:

$$(1 - \lambda)\,[(5 - \lambda)(1 - \lambda) - 1] - (1 - \lambda - 3) + 3\,[1 - 3(5 - \lambda)] = 0$$

or

$$\lambda^3 - 7\lambda^2 + 36 = 0.$$

It is easily verified that the roots of this equation are

$$\lambda_1 = -2; \quad \lambda_2 = 3; \quad \lambda_3 = 6,$$

and the equation of our surface, referred to axes of symmetry, is

$$-2x_1'^2 + 3x_2'^2 + 6x_3'^2 = 5.$$
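We shall now find the elements of the orthogonal matrix:

$$
B = \begin{Vmatrix}
b_{11} & b_{12} & b_{13} \\
b_{21} & b_{22} & b_{23} \\
b_{31} & b_{32} & b_{33}
\end{Vmatrix}.
$$

We have the system for these,

$$
\begin{aligned}
(1 - \lambda_k)\, b_{1k} + b_{2k} + 3b_{3k} &= 0 \\
b_{1k} + (5 - \lambda_k)\, b_{2k} + b_{3k} &= 0 \\
3b_{1k} + b_{2k} + (1 - \lambda_k)\, b_{3k} &= 0.
\end{aligned}
\tag{155}
$$

We first substitute $\lambda_k = \lambda_1 = -2$, which leads us to two equations:

$$
\begin{aligned}
3b_{11} + b_{21} + 3b_{31} &= 0 \\
b_{11} + 7b_{21} + b_{31} &= 0.
\end{aligned}
$$

The solution of this system has the form

$$b_{11} = -k_1; \quad b_{21} = 0; \quad b_{31} = k_1,$$

where $k_1$ is an arbitrary number. We choose it so that the sum of the squares of the numbers making up the solution is equal to unity. We finally have

$$b_{11} = \frac{1}{\sqrt{2}}; \quad b_{21} = 0; \quad b_{31} = -\frac{1}{\sqrt{2}},$$

where the solution can be taken with the opposite sign.

We now substitute $\lambda_k = \lambda_2 = 3$ in the coefficients of system (155). We get a system in which the third equation is the difference between the first two, so that it reduces to two equations:

$$
\begin{aligned}
-2b_{12} + b_{22} + 3b_{32} &= 0 \\
b_{12} + 2b_{22} + b_{32} &= 0.
\end{aligned}
$$

The solution of this system, normalized to unity, is easily found:

$$b_{12} = \frac{1}{\sqrt{3}}; \quad b_{22} = -\frac{1}{\sqrt{3}}; \quad b_{32} = \frac{1}{\sqrt{3}}.$$

We finally substitute the third root in the coefficients of (155). Again we get a system in which one equation is a consequence of the other two. On solving the two remaining equations and normalizing to unity, we have:

$$b_{13} = \frac{1}{\sqrt{6}}; \quad b_{23} = \frac{2}{\sqrt{6}}; \quad b_{33} = \frac{1}{\sqrt{6}}.$$

The transformation of the variables is given in the present case, in accordance with (136), by

$$
\begin{aligned}
x_1 &= \frac{1}{\sqrt{2}}\,x_1' + \frac{1}{\sqrt{3}}\,x_2' + \frac{1}{\sqrt{6}}\,x_3' \\
x_2 &= -\frac{1}{\sqrt{3}}\,x_2' + \frac{2}{\sqrt{6}}\,x_3' \\
x_3 &= -\frac{1}{\sqrt{2}}\,x_1' + \frac{1}{\sqrt{3}}\,x_2' + \frac{1}{\sqrt{6}}\,x_3'.
\end{aligned}
$$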
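The result of Example 1 can be checked numerically. The Python sketch below assumes the matrix of the form and the normalized columns $(1, 0, -1)/\sqrt{2}$, $(1, -1, 1)/\sqrt{3}$, $(1, 2, 1)/\sqrt{6}$ obtained above, and verifies that $B$ is orthogonal and that the transformed matrix is diagonal with the roots $-2$, $3$, $6$ on the diagonal. The helper names are assumptions of this sketch.

```python
import math

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    return [[sum(p[i][s] * q[s][k] for s in range(len(q)))
             for k in range(len(q[0]))] for i in range(len(p))]

A = [[1, 1, 3], [1, 5, 1], [3, 1, 1]]      # matrix of the quadratic form

r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
# columns of B are the normalized solutions for the roots -2, 3, 6
B = [[ 1 / r2,  1 / r3, 1 / r6],
     [ 0.0,    -1 / r3, 2 / r6],
     [-1 / r2,  1 / r3, 1 / r6]]

BtB = matmul(transpose(B), B)              # should be the unit matrix
C = matmul(transpose(B), matmul(A, B))     # should be diag(-2, 3, 6)
```

Changing the sign of any column of `B` leaves both checks unaffected, which is the arbitrariness noted for simple roots.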


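2. To reduce to axes of symmetry the surface

$$2x_1^2 + 6x_2^2 + 2x_3^2 + 8x_1x_3 = 1.$$

Here the quadratic form is written as

$$
\begin{aligned}
\varphi = {} & 2x_1^2 + 0\,x_1x_2 + 4x_1x_3 \\
& + 0\,x_2x_1 + 6x_2^2 + 0\,x_2x_3 \\
& + 4x_3x_1 + 0\,x_3x_2 + 2x_3^2.
\end{aligned}
$$

The characteristic equation of its matrix is

$$
\begin{vmatrix}
2 - \lambda & 0 & 4 \\
0 & 6 - \lambda & 0 \\
4 & 0 & 2 - \lambda
\end{vmatrix} = 0.
$$

Expansion of the determinant gives us the equation:

$$\lambda^3 - 10\lambda^2 + 12\lambda + 72 = 0.$$

The roots are

$$\lambda_1 = -2; \quad \lambda_2 = \lambda_3 = 6,$$

i.e. the equation has a double root.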


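Next we find the coefficients of the orthogonal transformation, for which we have the system

$$
\begin{aligned}
(2 - \lambda_k)\, b_{1k} + 4b_{3k} &= 0 \\
(6 - \lambda_k)\, b_{2k} &= 0 \\
4b_{1k} + (2 - \lambda_k)\, b_{3k} &= 0.
\end{aligned}
\tag{155$_1$}
$$

On substituting $\lambda_k = -2$, we easily arrive at the normalized solution

$$b_{11} = \frac{1}{\sqrt{2}}; \quad b_{21} = 0; \quad b_{31} = -\frac{1}{\sqrt{2}}.$$

We now substitute the double root $\lambda_k = 6$ in the coefficients of system (155$_1$), for which we must get two linearly independent and mutually orthogonal solutions. The substitution leads us to the single equation

$$-b_{1k} + b_{3k} = 0.$$

We take the normalized solution of this equation:

$$b_{12} = \frac{1}{\sqrt{2}}; \quad b_{22} = 0; \quad b_{32} = \frac{1}{\sqrt{2}}.$$

As regards the second solution, we notice that it has to satisfy both system (155$_1$) and the condition for orthogonality with the solution already obtained. This gives us two equations for it:

$$
\begin{aligned}
-b_{13} + b_{33} &= 0 \\
\frac{1}{\sqrt{2}}\, b_{13} + \frac{1}{\sqrt{2}}\, b_{33} &= 0
\end{aligned}
$$

or

$$b_{13} = b_{33} = 0,$$

whence the normalized solution becomes

$$b_{13} = 0; \quad b_{23} = 1; \quad b_{33} = 0.$$

Finally, the orthogonal transformation will be, in accordance with (136),

$$
\begin{aligned}
x_1 &= \frac{1}{\sqrt{2}}\,x_1' + \frac{1}{\sqrt{2}}\,x_2' \\
x_2 &= x_3' \\
x_3 &= -\frac{1}{\sqrt{2}}\,x_1' + \frac{1}{\sqrt{2}}\,x_2',
\end{aligned}
$$

and the surface has the equation, referred to axes of symmetry:

$$-2x_1'^2 + 6\,(x_2'^2 + x_3'^2) = 1.$$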
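The arbitrariness in the choice of solutions for a double root can be illustrated on Example 2. The Python sketch below works with unnormalized integer vectors so that all checks are exact. The pair $(1, 0, 1)$, $(0, 1, 0)$ is the one obtained above, while the pair $(1, 2, 1)$, $(1, -1, 1)$ is an alternative chosen here purely for illustration; the helper names are assumptions of this sketch.

```python
A = [[2, 0, 4], [0, 6, 0], [4, 0, 2]]   # matrix of the form in Example 2

def apply(a, v):
    """Product of the matrix a with the vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def is_eigenvector(v, lam):
    """True if A v = lam v holds exactly."""
    return apply(A, v) == [lam * c for c in v]

v1 = [1, 0, -1]                              # root -2
pair_from_text = ([1, 0, 1], [0, 1, 0])      # root 6, as found above
pair_alternative = ([1, 2, 1], [1, -1, 1])   # root 6, another orthogonal pair

def valid_pair(pair):
    """Both vectors belong to the root 6 and are orthogonal to each other and to v1."""
    u, w = pair
    return (is_eigenvector(u, 6) and is_eigenvector(w, 6)
            and dot(u, w) == 0 and dot(u, v1) == 0 and dot(w, v1) == 0)
```

Both `valid_pair(pair_from_text)` and `valid_pair(pair_alternative)` hold; after normalization either pair can supply the second and third columns of $B$.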
35. Classification of quadratic forms. The problem of reducing a quadratic form to a sum of squares can be posed in a more general form than that given above, where we have required orthogonality of the linear transformation from the new variables to the old. We can take the following more general problem: to reduce the real quadratic form (134) to the form

$$\varphi = \mu_1 X_1^2 + \mu_2 X_2^2 + \ldots + \mu_n X_n^2, \tag{156}$$

where the $X_k$ are $n$ linearly independent real linear forms in the variables $x_i$. The coefficients $\mu_k$ in this statement of the problem are not definite numbers such as we had above, though we can say something about them, viz., the number of non-zero $\mu_k$ must always be equal to the rank of the matrix composed of the coefficients $a_{ik}$ of the quadratic form. In other words, in any reduction of a quadratic form to a sum of squares of linearly independent linear forms, the number of the squares is equal to the rank of the matrix just mentioned. In addition to this, a further property holds, which is usually known as the law of inertia of quadratic forms: in any transformation of a real quadratic form to the form (156), where the linear forms $X_k$ are also real, the number of positive coefficients $\mu_k$ (and the number of negative coefficients $\mu_k$) is always the same. We shall prove these assertions at the end of the present section.


This general problem of reducing a quadratic form to form (156) is always very easily solved on separating out perfect squares. We shall do this in the particular example:

$$\varphi = x_1^2 + 4x_2^2 + x_3^2 + 2x_1x_2 - 6x_1x_3 + 8x_2x_3.$$

We obtain a perfect square from $(x_1^2 + 2x_1x_2 - 6x_1x_3)$ by adding $(x_2^2 + 9x_3^2 - 6x_2x_3)$, when we can write $\varphi$ in the form

$$\varphi = (x_1 + x_2 - 3x_3)^2 + 3x_2^2 - 8x_3^2 + 14x_2x_3.$$

We separate out a further perfect square in the same way, and can finally write our quadratic form in form (156):

$$\varphi = (x_1 + x_2 - 3x_3)^2 - 2\left(2x_3 - \frac{7}{4}\,x_2\right)^2 + \frac{73}{8}\,x_2^2.$$



The linear forms appearing in the brackets are clearly linearly independent. 
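The reduction in this example can be checked in exact arithmetic. The Python sketch below compares the original form with the sum of squares, taking the coefficients $7/4$ and $73/8$ as they follow from completing the square in $x_3$; the function names are assumptions of this sketch. Since both sides are quadratic in each variable, agreement on a grid with five values per variable is enough to establish the identity.

```python
from fractions import Fraction as F
from itertools import product

def phi(x1, x2, x3):
    """The quadratic form of the example."""
    return x1*x1 + 4*x2*x2 + x3*x3 + 2*x1*x2 - 6*x1*x3 + 8*x2*x3

def phi_squares(x1, x2, x3):
    """The same form written as a combination of three squares."""
    X1 = x1 + x2 - 3*x3
    X2 = 2*x3 - F(7, 4)*x2
    X3 = x2
    return X1**2 - 2*X2**2 + F(73, 8)*X3**2

# compare the two expressions on a 5 x 5 x 5 integer grid
agree = all(phi(*p) == phi_squares(*p)
            for p in product(range(-2, 3), repeat=3))
```

The coefficients of the three squares are $1$, $-2$ and $73/8$, i.e. two positive and one negative.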
The working is somewhat different if squares of the variables are absent in the expression for $\varphi$. Suppose we have

$$\varphi = a\,x_1x_2 + Px_1 + Qx_2 + R,$$

where $a$ is a numerical coefficient differing from zero, $P$ and $Q$ are linear forms of variables, not including $x_1$ and $x_2$, and $R$ is a quadratic form which likewise does not include $x_1$ and $x_2$. We can write

$$\varphi = a\left(x_1 + \frac{Q}{a}\right)\left(x_2 + \frac{P}{a}\right) + R - \frac{PQ}{a}.$$

If we set

$$X_1 = \frac{1}{2}\left(x_1 + x_2 + \frac{P + Q}{a}\right); \quad X_2 = \frac{1}{2}\left(x_1 - x_2 - \frac{P - Q}{a}\right)$$

and

$$\varphi_1 = R - \frac{PQ}{a},$$

we get

$$\varphi = aX_1^2 - aX_2^2 + \varphi_1,$$

where $\varphi_1$ is a quadratic form which does not include $x_1$ and $x_2$. By separating out these two squares, we have got rid of two variables.
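As an illustration of the case without squares, consider the form $x_1x_2 + x_1x_3 + x_2x_3$, which is an example chosen here and not taken from the text. In the notation above $a = 1$, $P = x_3$, $Q = x_3$, $R = 0$. The Python sketch below checks in exact arithmetic that the substitution reproduces the form.

```python
from fractions import Fraction as F
from itertools import product

def phi(x1, x2, x3):
    """Illustrative form without squares: a = 1, P = x3, Q = x3, R = 0."""
    return x1*x2 + x1*x3 + x2*x3

def phi_reduced(x1, x2, x3):
    """The same form as a*X1^2 - a*X2^2 + phi1."""
    a, P, Q, R = 1, x3, x3, 0
    X1 = F(1, 2) * (x1 + x2 + F(P + Q, a))
    X2 = F(1, 2) * (x1 - x2 - F(P - Q, a))
    phi1 = R - F(P * Q, a)          # here phi1 = -x3^2
    return a * X1**2 - a * X2**2 + phi1

agree = all(phi(*p) == phi_reduced(*p)
            for p in product(range(-2, 3), repeat=3))
```

In this example the remaining form is $-x_3^2$, so the reduction gives one positive and two negative squares.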


The reduction of a quadratic form to form (156) makes it possible to classify the form as follows:

I. Let all the coefficients $\mu_k$ in (156) be positive. In this case the form is said to be positive definite. It may easily be seen to have positive values for all real $x_i$ and to vanish only when all the $x_i$ vanish. For, since all the $\mu_k$ are positive, the necessary and sufficient condition for the vanishing of the right-hand side of (156) is that all the linear forms $X_k$ vanish. We thus get a system of $n$ homogeneous equations for the $x_i$ with a non-zero determinant (the forms are linearly independent), so that only the zero solution exists.

II. If all the $\mu_k$ are negative, the quadratic form is said to be negative definite. As above, it can be seen to have only negative values for all real $x_i$ and to vanish only when all the $x_i$ vanish.

III. We now take the case when some of the $\mu_k$ vanish, though all the remainder have the same sign, say positive. Our form $\varphi$ now becomes

$$\varphi = \mu_1 X_1^2 + \ldots + \mu_m X_m^2 \quad (m < n), \tag{156$_1$}$$

where all the $\mu_k$ are positive. Here again the form cannot be negative whatever the values of the $x_i$, though it can vanish for non-zero $x_i$. For if we want to find the zeros of the form, we have to write a system of $m$ homogeneous equations in $x_i$:

$$X_1 = X_2 = \ldots = X_m = 0,$$

and since $m < n$, this system certainly has non-zero solutions. Similarly, if all the $\mu_k$ are negative in (156$_1$), the quadratic form cannot take positive values, though it can vanish for non-zero $x_i$. Here the form is said to be positive or negative semi-definite.

IV. Finally, if we get both positive and negative coefficients $\mu_k$ in (156), the form may easily be shown to take both positive and negative values for real $x_i$. It is described as alternating in this case.
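The four cases can be summarized in a short Python sketch. The function name `classify` and its string results are assumptions of this sketch; the argument `mu` is the list of coefficients in a reduction to squares of linearly independent real linear forms, and `n` is the number of variables.

```python
def classify(mu, n):
    """Classify a real quadratic form in n variables from the coefficients mu."""
    pos = sum(1 for m in mu if m > 0)
    neg = sum(1 for m in mu if m < 0)
    if pos == n:
        return "positive definite"          # case I
    if neg == n:
        return "negative definite"          # case II
    if pos > 0 and neg > 0:
        return "alternating"                # case IV
    if pos > 0:
        return "positive semi-definite"     # case III, fewer than n squares
    if neg > 0:
        return "negative semi-definite"     # case III with negative signs
    return "identically zero"

examples = [
    classify([1, 1], 2),
    classify([-2, 3, 6], 3),
    classify([1, 2, 0], 3),
    classify([-1, -4], 3),
]
```

The last branch, a form with no non-zero coefficient at all, is added only to make the function total; it is not one of the cases listed above.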


The above classification of real quadratic forms has an immediate application to the problem of the maxima and minima of functions of several variables. Let us take the function of $n$ independent variables $x_1, \ldots, x_n$:

$$\psi(x_1, \ldots, x_n),$$

and let the necessary conditions for maxima and minima be satisfied for $x_1 = \ldots = x_n = 0$, i.e. all the partial derivatives of $\psi$ with respect to the independent variables vanish at the origin. We have on expanding our function in a Maclaurin series:

$$\psi(x_1, \ldots, x_n) - \psi(0, \ldots, 0) = \varphi(x_1, \ldots, x_n) + \omega,$$

where we have written $\varphi(x_1, \ldots, x_n)$ for the quadratic form in the variables $x_i$, and $\omega$ for the set of terms of higher order than the second in the $x_i$. If the quadratic form $\varphi$ is positive definite, we have a minimum of the function at $x_1 = \ldots = x_n = 0$. If it is negative definite, we have a maximum. If it is alternating, we have neither a maximum nor a minimum, and finally, if $\varphi$ is positive or negative semi-definite, we have a doubtful case. This result is the natural complement of that obtained in [I, 133] for functions of two independent variables.


We turn to the proof of the statements made at the beginning of the present section. Let us take the quadratic form

$$\varphi = \sum_{i,k=1}^{n} a_{ik} x_i x_k \quad (a_{ik} = a_{ki}),$$

the rank of the matrix of its coefficients being $r$. We compose the system of $n$ linear forms:

$$\frac{1}{2}\,\frac{\partial \varphi}{\partial x_s} = \sum_{t=1}^{n} a_{st} x_t \quad (s = 1, 2, \ldots, n). \tag{157}$$

We have used the conditions $a_{ik} = a_{ki}$ in forming the expressions for these partial derivatives. Obviously $r$ is the rank of system (157) in the sense described in [11].

Suppose that $\varphi$ is reduced to the sum of the squares of $m$ linearly independent forms $y_s$:

$$y_s = \beta_{s1}x_1 + \beta_{s2}x_2 + \ldots + \beta_{sn}x_n, \tag{158}$$

i.e.

$$\varphi = \mu_1 y_1^2 + \mu_2 y_2^2 + \ldots + \mu_m y_m^2, \tag{159}$$

where the $\mu_s$ differ from zero. We have to show that $m = r$. We use (159) to obtain the linear forms (157):

$$\frac{1}{2}\,\frac{\partial \varphi}{\partial x_s} = \mu_1\beta_{1s}\,y_1 + \mu_2\beta_{2s}\,y_2 + \ldots + \mu_m\beta_{ms}\,y_m \quad (s = 1, 2, \ldots, n). \tag{157$_1$}$$


The variables $y_s$ can take any values since forms (158) are linearly independent. Hence by definition of the linear dependence of forms (157$_1$) the $y_s$ can be taken as independent variables, and the greatest number of linearly independent forms in system (157$_1$) must be equal to the rank of the matrix of coefficients $\mu_k\beta_{ki}$, where the column subscript $k$ takes all the values $k = 1, 2, \ldots, m$, and the row subscript $i$ all the values $i = 1, 2, \ldots, n$. The elements of each column of the matrix have the common factor $\mu_k$ which is non-zero, and hence the rank of the matrix of $\mu_k\beta_{ki}$ is the same as the rank of the matrix of $\beta_{ki}$. Since (158) is a system of $m$ linearly independent forms, this rank is $m$, i.e. the greatest number of linearly independent forms in system (157$_1$) or (157) is $m$. On the other hand, this number is $r$ by hypothesis, whence it follows that $m = r$.

We now show that the number of positive (and negative) coefficients $\mu_s$ is always the same, whatever the method of writing $\varphi$ as an expression of the type (159), where the $y_s$ are real linearly independent forms. We shall assume the opposite and prove a contradiction. Thus let $\varphi$ be expressed by two formulae of type (159) in which the number of positive coefficients is not the same:

$$
\begin{aligned}
\varphi &= \lambda_1 y_1^2 + \ldots + \lambda_p y_p^2 - \lambda_{p+1} y_{p+1}^2 - \ldots - \lambda_m y_m^2 \\
\varphi &= \lambda_1' y_1'^2 + \ldots + \lambda_q' y_q'^2 - \lambda_{q+1}' y_{q+1}'^2 - \ldots - \lambda_m' y_m'^2.
\end{aligned}
\tag{160}
$$

The $\lambda_s$ and $\lambda_s'$ in these expressions are assumed positive. The forms $y_1, \ldots, y_m$ are linearly independent, and the same can be said of $y_1', \ldots, y_m'$. Since $p \ne q$, we can always take say $p < q$. We show that this leads to an absurdity. We associate the forms $y_{m+1}, \ldots, y_n$ with $y_1, \ldots, y_m$, so as to obtain a complete system of linearly independent forms [11]. We write down the system of linear homogeneous equations for $x_1, x_2, \ldots, x_n$:

$$y_1 = 0; \ldots; y_p = 0; \quad y_{q+1}' = 0; \ldots; y_m' = 0; \quad y_{m+1} = 0; \ldots; y_n = 0. \tag{161}$$

The number of these homogeneous equations is

$$p + (m - q) + (n - m) = n - (q - p),$$

and, since $q - p > 0$, this number is less than $n$. Consequently the system written has real non-zero solutions. We take any one of the solutions: $x_i = x_i^{(0)}$ $(i = 1, 2, \ldots, n)$. With these values of $x_i$, we have by (160) and (161):

$$-\lambda_{p+1} y_{p+1}^2 - \ldots - \lambda_m y_m^2 = \lambda_1' y_1'^2 + \ldots + \lambda_q' y_q'^2.$$

The left-hand side here cannot be positive and the right-hand side cannot be negative. It is clear from this that $\varphi$ must vanish for $x_i = x_i^{(0)}$, and these $x_i^{(0)}$ must therefore satisfy, in addition to equations (161):

$$y_{p+1} = 0; \ldots; y_m = 0.$$

We see finally that the complete system of linearly independent forms $y_1, \ldots, y_n$ must vanish for $x_i = x_i^{(0)}$. But this is impossible, inasmuch as the linear independence of the forms $y_s$ implies the non-vanishing of the determinant of the system

$$y_1 = 0; \quad y_2 = 0; \ldots; \quad y_n = 0,$$

which is homogeneous in $x_1, x_2, \ldots, x_n$. We have proved the law of inertia by arriving at this contradiction.


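36. Jacobi’s formula. Jacobi’s formula, which offers a convenient form of the reduction of a quadratic form to a sum of squares, will be stated without proof.

We first introduce the following notation:

$$A_i(x) = \sum_{k=1}^{n} a_{ik} x_k \quad (i = 1, 2, \ldots, n),$$

$$
\Delta_0 = 1; \quad \Delta_1 = a_{11}; \quad
\Delta_k = \begin{vmatrix}
a_{11} & a_{12} & \ldots & a_{1k} \\
a_{21} & a_{22} & \ldots & a_{2k} \\
\ldots & \ldots & \ldots & \ldots \\
a_{k1} & a_{k2} & \ldots & a_{kk}
\end{vmatrix},
$$

$$
X_1 = A_1(x); \quad
X_k = \begin{vmatrix}
a_{11} & \ldots & a_{1,k-1} & A_1(x) \\
\ldots & \ldots & \ldots & \ldots \\
a_{k1} & \ldots & a_{k,k-1} & A_k(x)
\end{vmatrix}
\quad (k = 2, 3, \ldots, n).
$$

If the rank of the matrix of the $a_{ik}$ is $r$, and the determinants $\Delta_1, \Delta_2, \ldots, \Delta_r$ do not vanish, Jacobi’s formula becomes:

$$\sum_{i,k=1}^{n} a_{ik} x_i x_k = \sum_{k=1}^{r} \frac{X_k^2}{\Delta_{k-1}\Delta_k}, \tag{162}$$

where the linear forms $X_k$ $(k = 1, 2, \ldots, r)$ are linearly independent. The formula makes it possible to see from the signs of the $\Delta_k$ to what class $\varphi$ belongs as regards the law of inertia.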

In particular, if all the determinants 4), 4,, ..., 4, are positive (with this, 
r =n), it follows from (162) that 9 is positive definite. The converse can be 
shown: if g is @ positive definite form, all the determinants must be positive. 
The variables z, can be enumerated in any order, of course, when applying 
(162). The 4, naturally also change on changing the enumeration, and each 
of the principal minors of the matrix | a, R can be a determinant of the sequence 
4, for a given enumeration of the ,. It follows from what has been said above 
that all the principal minors are positive in the case of a positive definite 
form 9, but it is sufficient here to verify the positiveness of the determinants 


It can be shown that the necessary and sufficient condition for $\varphi$ to be
non-negative (of constant sign) is for all the principal minors to be non-negative,
i.e. they can be greater than or equal to zero. It is not sufficient in this case to
find the signs of the determinants $\Delta_k$ only, and the signs of all the principal
minors have to be determined.
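The difference between the two criteria can be seen on a small example. The Python sketch below computes the leading determinants $\Delta_k$ and the full set of principal minors; the helper names and both matrices are illustrative assumptions. The second matrix belongs to the form $-x_2^2$, which takes negative values although none of its $\Delta_k$ is negative.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def leading_minors(a):
    """The determinants Delta_1, ..., Delta_n for the given enumeration of variables."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]


def principal_minors(a):
    """All principal minors: the same index set is used for rows and columns."""
    n = len(a)
    minors = []
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            minors.append(det([[a[i][j] for j in idx] for i in idx]))
    return minors


def is_positive_definite(a):
    return all(d > 0 for d in leading_minors(a))


def is_nonnegative(a):
    return all(d >= 0 for d in principal_minors(a))


definite = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
degenerate = [[0, 0], [0, -1]]  # matrix of the form -x_2^2
```

Here `leading_minors(degenerate)` contains only zeros, yet `is_nonnegative(degenerate)` is false because the principal minor formed from the second row and column equals $-1$.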




A proof of the statements of this section may be found in Ostsilyatsionnie
matritsi i malie kolebaniya mechanicheskich system (“Oscillation matrices
and the small vibrations of mechanical systems”) by F. R. Gantmaker and
M. G. Krein (1941).


37. The simultaneous reduction of two quadratic forms to sums of
squares. Suppose we have the two quadratic forms

$$\varphi_1 = \sum_{i,k=1}^{n} a_{ik} x_i x_k; \quad \varphi_2 = \sum_{i,k=1}^{n} b_{ik} x_i x_k,$$


where $\varphi_1$ is positive definite, i.e. reduces to the sum of $n$ positive
squares. We require to find a linear transformation (not necessarily
orthogonal) such that both forms are reduced to sums of squares.

We first of all introduce new variables $y_k$ such that $\varphi_1$ reduces
to a sum of squares. This can be done say by the elementary method
indicated in the previous section. Our forms will become in the
new variables:

$$\varphi_1 = \sum_{k=1}^{n} \mu_k y_k^2; \quad \varphi_2 = \sum_{i,k=1}^{n} b'_{ik} y_i y_k.$$

All the $\mu_k$ are positive by hypothesis, and we can bring in new
real variables $z_k = \sqrt{\mu_k}\, y_k$. Now we have:

$$\varphi_1 = \sum_{k=1}^{n} z_k^2; \quad \varphi_2 = \sum_{i,k=1}^{n} c_{ik} z_i z_k.$$

We carry out an orthogonal transformation of the $z_k$ to new variables
$x'_k$, such that $\varphi_2$ reduces to a sum of squares.

Since the transformation is orthogonal here, $\varphi_1$ remains a sum
of squares, and we have finally reduced both forms to sums of squares:

$$\varphi_1 = \sum_{k=1}^{n} x_k'^2; \quad \varphi_2 = \sum_{k=1}^{n} \lambda_k x_k'^2. \qquad (162_1)$$


The $\lambda_k$ are sometimes called the characteristic roots of form $\varphi_2$ with
respect to form $\varphi_1$.

We now establish the equation that has to be satisfied by these $\lambda_k$,
and which is completely analogous to equation (144) of [32]. For
this, we introduce the discriminant of a quadratic form, defined as
follows: the discriminant of a quadratic form is the determinant made
up of its coefficients.




Suppose we transform $\varphi$, the matrix of the coefficients of which
is $A$, to new variables with the aid of the transformation

$$(x_1, \ldots, x_n) = B(x'_1, \ldots, x'_n).$$

As we know from [32], the matrix of the new form is

$$C = B^{(*)} A B,$$

and its determinant is given by

$$D(C) = D(B^{(*)})\, D(A)\, D(B).$$

The determinants $D(B^{(*)})$ and $D(B)$ are clearly equal since the
corresponding matrices are obtained from each other by interchanging
rows with columns. We thus have

$$D(C) = D(A)\, [D(B)]^2,$$

i.e. on linearly transforming the variables in a quadratic form its
discriminant is multiplied by the square of the determinant of the
linear transformation.
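The rule for the discriminant is easily checked on numbers. In the following Python sketch the matrices `a` and `b` are arbitrary illustrative choices (assumptions, not data from the text), and the transposed matrix plays the part of $B^{(*)}$.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    return [[sum(p[i][s] * q[s][k] for s in range(len(q))) for k in range(len(q[0]))]
            for i in range(len(p))]


a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # coefficients of the form (symmetric)
b = [[1, 2, 0], [0, 1, 3], [1, 0, 1]]   # matrix of the change of variables
c = matmul(matmul(transpose(b), a), b)  # matrix of the form in the new variables
```

With these values $D(A) = 18$ and $D(B) = 7$, so that $D(C)$ should equal $18 \cdot 7^2 = 882$.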

We now return to our quadratic forms $\varphi_1$, $\varphi_2$ and consider the form

$$\omega = \varphi_2 - \lambda \varphi_1 = \sum_{i,k=1}^{n} (b_{ik} - \lambda a_{ik})\, x_i x_k,$$

the coefficients of which contain the parameter $\lambda$.
After transformation to the new variables, this form becomes

$$\omega = \sum_{k=1}^{n} (\lambda_k - \lambda)\, x_k'^2,$$

and its discriminant in the new variables is evidently given by the
product

$$(\lambda_1 - \lambda)(\lambda_2 - \lambda) \ldots (\lambda_n - \lambda), \qquad (163)$$

whilst the discriminant in the old variables is equal to the determinant
with elements $(b_{ik} - \lambda a_{ik})$. As we have shown, these two discriminants
differ only by a factor, viz., the square of the determinant of the
transformation, which neither vanishes nor contains $\lambda$. It follows at
once from this that both discriminants have the same roots with
respect to the parameter $\lambda$. On taking into account $(162_1)$, we see
that the numbers $\lambda_k$ are the roots of the equation

$$\begin{vmatrix}
b_{11} - \lambda a_{11} & b_{12} - \lambda a_{12} & \ldots & b_{1n} - \lambda a_{1n} \\
b_{21} - \lambda a_{21} & b_{22} - \lambda a_{22} & \ldots & b_{2n} - \lambda a_{2n} \\
\ldots & \ldots & \ldots & \ldots \\
b_{n1} - \lambda a_{n1} & b_{n2} - \lambda a_{n2} & \ldots & b_{nn} - \lambda a_{nn}
\end{vmatrix} = 0. \qquad (163_1)$$
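For two variables equation $(163_1)$ is a quadratic in $\lambda$ and can be solved directly. The Python sketch below does this for an illustrative pair of forms; the matrices and the function names are assumptions made only for the example. For each root it also finds a non-zero solution of the homogeneous system with matrix $b - \lambda a$ and compares the values of the two forms at that point.

```python
import math


def relative_roots(a, b):
    """Roots of det(b - lam * a) = 0 for 2x2 symmetric a (positive definite) and b."""
    p = a[0][0] * a[1][1] - a[0][1] ** 2
    q = a[0][0] * b[1][1] + a[1][1] * b[0][0] - 2 * a[0][1] * b[0][1]
    r = b[0][0] * b[1][1] - b[0][1] ** 2
    # The roots are real when a is positive definite; max() only guards rounding.
    disc = math.sqrt(max(q * q - 4 * p * r, 0.0))
    return ((q - disc) / (2 * p), (q + disc) / (2 * p))


def form(m, x):
    """Value of the quadratic form with matrix m at the point x."""
    return sum(m[i][k] * x[i] * x[k] for i in range(2) for k in range(2))


def null_vector(a, b, lam):
    """A solution of (b - lam * a) x = 0, assuming the first row of b - lam * a is not zero."""
    m00 = b[0][0] - lam * a[0][0]
    m01 = b[0][1] - lam * a[0][1]
    return (m01, -m00)


a = [[2.0, 1.0], [1.0, 2.0]]   # matrix of the positive definite form
b = [[1.0, 0.0], [0.0, 3.0]]   # matrix of the second form
roots = relative_roots(a, b)
pairs = [(lam, null_vector(a, b, lam)) for lam in roots]
```

For this pair the equation is $3\lambda^2 - 8\lambda + 3 = 0$, and at each vector found the value of the second form is $\lambda$ times the value of the first.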


38. Small vibrations. We saw above [II, 19] that the motion of a mechanical
system with $n$ degrees of freedom, the constraints of which do not contain
time and which finds itself under the action of conservative forces, is given by the
system of differential equations:

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot q_k}\right) - \frac{\partial T}{\partial q_k} = \frac{\partial U}{\partial q_k} \quad (k = 1, 2, \ldots, n), \qquad (164)$$

where $T$ is the kinetic energy of the system and $U$ the given force function of
the $q_k$, which we take to be independent of $t$. As was mentioned previously,
$T$ is a quadratic form of the derivatives $\dot q_k$ of $q_k$ with respect to time:

$$T = \sum_{i,k=1}^{n} a_{ik}\, \dot q_i \dot q_k \quad (a_{ki} = a_{ik}), \qquad (165)$$

the coefficients being given functions of the $q_k$. Suppose we have

$$\frac{\partial U}{\partial q_k} = 0 \quad \text{for} \quad q_1 = \ldots = q_n = 0 \quad (k = 1, 2, \ldots, n), \qquad (166)$$

i.e. the partial derivatives of $U$ vanish for $q_k = 0$.
With this, system (164) has the obvious solution $q_1 = \ldots = q_n = 0$, cor-
responding to a position of equilibrium. The function $U$ is defined except for
a constant, and we can always suppose that it vanishes for $q_1 = \ldots = q_n = 0$.


We can therefore say, in view of (166), that the expansion of $U$ in powers of
$q_k$ starts with a second order term. Let the quadratic form obtained from these
second order terms be negative definite, whence it follows that $U$ has a
maximum for $q_1 = \ldots = q_n = 0$, or what amounts to the same thing, the
potential energy $(-U)$ has a minimum. We proved in [II, 19] that the equi-
librium position $q_1 = \ldots = q_n = 0$ is stable in this case, and for small initial
excitations the system performs small vibrations about the equilibrium position,
so that the $q_k$ remain small throughout the motion. We can, therefore, assume
when investigating these small vibrations, that $U$ reduces simply to second
order terms, i.e. has the form

$$-U = \sum_{i,k=1}^{n} b_{ik}\, q_i q_k \quad (b_{ki} = b_{ik}). \qquad (167)$$




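Similarly, in the coefficients $a_{ik}$ of (165), we can take $q_k = 0$ approximately,
so that the $a_{ik}$ are specific numbers. On applying all this to system (164), we get
a system of $n$ linear equations with constant coefficients:

$$\begin{aligned}
a_{11} \ddot q_1 + a_{12} \ddot q_2 + \ldots + a_{1n} \ddot q_n + b_{11} q_1 + b_{12} q_2 + \ldots + b_{1n} q_n &= 0 \\
a_{21} \ddot q_1 + a_{22} \ddot q_2 + \ldots + a_{2n} \ddot q_n + b_{21} q_1 + b_{22} q_2 + \ldots + b_{2n} q_n &= 0 \\
\ldots \qquad\qquad & \\
a_{n1} \ddot q_1 + a_{n2} \ddot q_2 + \ldots + a_{nn} \ddot q_n + b_{n1} q_1 + b_{n2} q_2 + \ldots + b_{nn} q_n &= 0.
\end{aligned} \qquad (168)$$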


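If we seek a solution of this system in the form of harmonic vibrations of
the same frequency and initial phase but with different amplitudes:

$$q_k = A_k \cos(\lambda t + \varphi) \quad (k = 1, 2, \ldots, n), \qquad (169)$$

substitution in (168) gives us a system of equations for the $A_k$ and $\lambda$:

$$\begin{aligned}
(b_{11} - \lambda^2 a_{11}) A_1 + (b_{12} - \lambda^2 a_{12}) A_2 + \ldots + (b_{1n} - \lambda^2 a_{1n}) A_n &= 0 \\
(b_{21} - \lambda^2 a_{21}) A_1 + (b_{22} - \lambda^2 a_{22}) A_2 + \ldots + (b_{2n} - \lambda^2 a_{2n}) A_n &= 0 \\
\ldots \qquad\qquad & \\
(b_{n1} - \lambda^2 a_{n1}) A_1 + (b_{n2} - \lambda^2 a_{n2}) A_2 + \ldots + (b_{nn} - \lambda^2 a_{nn}) A_n &= 0.
\end{aligned} \qquad (170)$$

The existence of a non-zero solution for the $A_k$ requires the vanishing of the
determinant of this system:

$$\begin{vmatrix}
b_{11} - \lambda^2 a_{11} & b_{12} - \lambda^2 a_{12} & \ldots & b_{1n} - \lambda^2 a_{1n} \\
b_{21} - \lambda^2 a_{21} & b_{22} - \lambda^2 a_{22} & \ldots & b_{2n} - \lambda^2 a_{2n} \\
\ldots & \ldots & \ldots & \ldots \\
b_{n1} - \lambda^2 a_{n1} & b_{n2} - \lambda^2 a_{n2} & \ldots & b_{nn} - \lambda^2 a_{nn}
\end{vmatrix} = 0. \qquad (171)$$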


On taking a root of this equation and substituting in the coefficients of system
(170), we get one or more solutions for the $A_k$, which we can then multiply by
arbitrary constants. Moreover, (169) contains the arbitrary constant $\varphi$.

We get a clearer solution of the problem by applying the theory of quadratic
forms. We first notice that quadratic form (165) in the variables $\dot q_k$ is positive
definite by its nature, inasmuch as it gives the kinetic energy of the motion.
The problem furthermore implies that form (167) is positive definite. As
we have seen, we can bring in new variables $p_s$ by means of a linear transforma-
tion with constant coefficients of the old variables $q_k$ such that both the forms
$T$ and $(-U)$ become sums of squares, where the coefficients of the squares
in the case of $T$ must be unity. We notice here that a linear relationship for
the $p_s$ and $q_k$ leads to the same relationship between $\dot p_s$ and $\dot q_k$. We thus have

$$T = \sum_{s=1}^{n} \dot p_s^2; \quad -U = \sum_{s=1}^{n} \lambda_s^2\, p_s^2, \qquad (172)$$

where all the coefficients of the $p_s^2$ are positive, so that we can write them as
squares. Along with (168), we can write the Lagrange equation (164) for the
new variables:

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot p_s}\right) - \frac{\partial T}{\partial p_s} = \frac{\partial U}{\partial p_s}.$$

On substituting from (172), we get the extremely simple system:

$$\ddot p_k + \lambda_k^2\, p_k = 0 \quad (k = 1, 2, \ldots, n).$$

The solutions of this system are

$$p_k = C_k \cos(\lambda_k t + \varphi_k) \quad (k = 1, 2, \ldots, n), \qquad (173)$$

where $C_k$ and $\varphi_k$ are arbitrary constants. The generalized coordinates $p_k$ are
called the principal coordinates of the mechanical system.

The original coordinates $q_k$ are given in terms of these by a linear transforma-
tion with constant coefficients. It follows at once from the results of the previous
section that the $\lambda_k$ must be the roots of equation (171). We remark that some
of the roots may be equal, though even in this case (173) still gives the general
solution of the problem of small vibrations within the context considered.
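As an illustration of the procedure, consider a system that does not appear in the text and is assumed here only for the example: two equal masses $m$ on a line, joined to each other and to two fixed walls by three equal springs of stiffness $c$. Then $T = \frac{m}{2}(\dot q_1^2 + \dot q_2^2)$ and $-U = c\,(q_1^2 - q_1 q_2 + q_2^2)$. The Python sketch below finds the frequencies from the determinant equation for $n = 2$ and checks the amplitudes against the homogeneous system for the $A_k$.

```python
import math


def normal_frequencies(a, b):
    """Positive roots lam of det(b - lam**2 * a) = 0 for 2x2 symmetric a and b."""
    p = a[0][0] * a[1][1] - a[0][1] ** 2
    q = a[0][0] * b[1][1] + a[1][1] * b[0][0] - 2 * a[0][1] * b[0][1]
    r = b[0][0] * b[1][1] - b[0][1] ** 2
    disc = math.sqrt(max(q * q - 4 * p * r, 0.0))
    # Both roots in lam**2 are positive when a and b are positive definite.
    return tuple(math.sqrt((q + sign * disc) / (2 * p)) for sign in (-1, 1))


def residual(a, b, lam, amp):
    """Left-hand sides of the amplitude equations for frequency lam and amplitudes amp."""
    return [sum((b[i][k] - lam ** 2 * a[i][k]) * amp[k] for k in range(2))
            for i in range(2)]


mass, stiffness = 1.0, 1.0                    # hypothetical values of m and c
a = [[mass / 2, 0.0], [0.0, mass / 2]]        # coefficients of T
b = [[stiffness, -stiffness / 2],
     [-stiffness / 2, stiffness]]             # coefficients of -U
low, high = normal_frequencies(a, b)
in_phase = residual(a, b, low, (1.0, 1.0))    # masses moving together
opposed = residual(a, b, high, (1.0, -1.0))   # masses moving against each other
```

The two frequencies are $\sqrt{c/m}$ and $\sqrt{3c/m}$; the principal coordinates are in this case proportional to $q_1 + q_2$ and $q_1 - q_2$.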


39. Extremal properties of the eigenvalues of quadratic forms. We consider
the reduction of a quadratic form to a sum of squares from a new point of view.
We confine ourselves to the case of three variables for the sake of simplicity:

$$\varphi = \sum_{i,k=1}^{3} a_{ik} x_i x_k = \sum_{k=1}^{3} \lambda_k x_k'^2, \qquad (174)$$

where the $x_k$ and $x'_k$ are related by an orthogonal transformation:

$$\begin{aligned}
x_1 &= b_{11} x'_1 + b_{12} x'_2 + b_{13} x'_3 \\
x_2 &= b_{21} x'_1 + b_{22} x'_2 + b_{23} x'_3 \\
x_3 &= b_{31} x'_1 + b_{32} x'_2 + b_{33} x'_3.
\end{aligned} \qquad (175)$$

We shall suppose for definiteness that the $\lambda_k$ are decreasing, i.e.

$$\lambda_1 > \lambda_2 > \lambda_3. \qquad (176)$$

Our problem consists in determining the numbers $\lambda_k$ and coefficients $b_{ik}$
from the values of the form $\varphi$ on the unit sphere $K$, i.e. on the sphere with centre at
the coordinate origin and unit radius:

$$x_1^2 + x_2^2 + x_3^2 = 1 \quad \text{or} \quad x_1'^2 + x_2'^2 + x_3'^2 = 1. \qquad (177)$$

Each point of the sphere characterizes a certain direction in space, defined
by the unit vector drawn from the origin to the point. We can write (174)
as

$$\varphi = \lambda_1 (x_1'^2 + x_2'^2 + x_3'^2) + (\lambda_2 - \lambda_1)\, x_2'^2 + (\lambda_3 - \lambda_1)\, x_3'^2,$$

whence it is clear that we have on the unit sphere $K$:

$$\varphi = \lambda_1 + (\lambda_2 - \lambda_1)\, x_2'^2 + (\lambda_3 - \lambda_1)\, x_3'^2.$$

Both differences in brackets are negative by (176), and it follows at once
from this that $\lambda_1$ is the maximum of $\varphi$ on $K$.
The maximum is obviously obtained at the point

$$x'_1 = 1; \quad x'_2 = x'_3 = 0,$$

or, by (175), at the point of $K$ with the old coordinates

$$x_1 = b_{11}; \quad x_2 = b_{21}; \quad x_3 = b_{31}.$$




This point defines the vector corresponding to the first column of orthogonal
transformation (175), i.e. the vector is the solution of the equation

$$A\mathbf{x} = \lambda \mathbf{x} \qquad (178)$$

with $\lambda = \lambda_1$. Thus the eigenvalue of first magnitude of quadratic form (174)
is equal to the maximum of the form on the unit sphere, whilst the correspond-
ing eigenvector $\mathbf{x}^{(1)}$ is the solution of (178) which runs from the origin to the
point of the unit sphere where the maximum occurs.

We now turn to finding the second eigenvalue and corresponding eigenvector.
Let $x'_1 = 0$ in the formulae. In this case we have the equation of a plane passing
through the origin and perpendicular to the vector $\mathbf{x}^{(1)}$. The intersection of
this plane with the unit sphere is the circle

$$x_2'^2 + x_3'^2 = 1.$$

We have on the circle:

$$\varphi = \lambda_2 x_2'^2 + \lambda_3 x_3'^2,$$

whence it is immediately clear that $\lambda_2$ is the maximum of $\varphi$ on the unit sphere
on condition that the corresponding vector is perpendicular to the $\mathbf{x}^{(1)}$ already
found. We can show in the same manner as above that the corresponding vector
$\mathbf{x}^{(2)}$, i.e. the solution of (178) for $\lambda = \lambda_2$, is the vector drawn to the point at
which the maximum occurs.

Having obtained the two vectors, the third, $\mathbf{x}^{(3)}$, follows from the fact that it
is perpendicular to both, whilst the eigenvalue $\lambda_3$ is the value of the form $\varphi$
at the point of the unit sphere where it intersects $\mathbf{x}^{(3)}$.

If we had say $\lambda_1 = \lambda_2$, our search for the first maximum of $\varphi$ on the unit
sphere would lead us to an entire circle where the maximum is obtained, instead
of a point.
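The same construction can be tried numerically in two variables, where the unit sphere becomes the unit circle. The Python sketch below searches a grid of points on the circle for the maximum of an illustrative form (the matrix, the grid size and the function names are assumptions) and then evaluates the form in the perpendicular direction. A grid search is used only for transparency; it is not an efficient method.

```python
import math


def form(a, x):
    """Value of the quadratic form with symmetric 2x2 matrix a at the point x."""
    return sum(a[i][k] * x[i] * x[k] for i in range(2) for k in range(2))


def maximize_on_unit_circle(a, steps=3600):
    """Largest value of the form on a uniform grid of the unit circle, and where it occurs."""
    best_value, best_point = None, None
    for j in range(steps):
        t = 2 * math.pi * j / steps
        x = (math.cos(t), math.sin(t))
        value = form(a, x)
        if best_value is None or value > best_value:
            best_value, best_point = value, x
    return best_value, best_point


a = [[2, 1], [1, 2]]                 # the form 2*x1**2 + 2*x1*x2 + 2*x2**2
lam1, x1 = maximize_on_unit_circle(a)
x2 = (-x1[1], x1[0])                 # unit vector perpendicular to x1
lam2 = form(a, x2)
image = (a[0][0] * x1[0] + a[0][1] * x1[1],
         a[1][0] * x1[0] + a[1][1] * x1[1])   # the vector A x1
```

For this form the maximum is 3, the value in the perpendicular direction is 1, and `image` agrees with `lam1` times `x1`, as (178) requires.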

The above discussion is easily carried over to the case of any number of
dimensions. We shall merely state the result, which is completely analogous
to the above. Suppose we have the real quadratic form in $n$ variables:

$$\varphi = \sum_{i,k=1}^{n} a_{ik} x_i x_k. \qquad (179)$$

A unit vector in real $n$-dimensional space is given by a set of real numbers,
the sum of the squares of which is equal to unity. We shall say that the ends
of these vectors lie on the unit sphere, the equation of which is obviously

$$x_1^2 + x_2^2 + \ldots + x_n^2 = 1. \qquad (180)$$


The highest characteristic root of the form $\varphi$ will be the maximum of $\varphi$
on the unit sphere (180), and the corresponding eigenvector is $\mathbf{x}^{(1)}$, drawn
from the origin to the point where the maximum occurs. To get the next lower
characteristic root, we consider the unit vectors perpendicular to the $\mathbf{x}^{(1)}$
already found. There will be an $\mathbf{x}^{(2)}$ among these, yielding the greatest value
of $\varphi$. This second maximum $\lambda_2$ is equal to the second eigenvalue of the form,
whilst $\mathbf{x}^{(2)}$ is the corresponding eigenvector. We now consider the unit vectors
perpendicular to $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$, which is equivalent to associating with condition
(180) the two further conditions:

$$(\mathbf{x}^{(1)}, \mathbf{x}) = 0 \quad \text{and} \quad (\mathbf{x}^{(2)}, \mathbf{x}) = 0.$$

One vector among these again yields a greatest value of $\varphi$, this being the
eigenvalue $\lambda_3$ that comes third in magnitude, the vector in question being
the corresponding eigenvector, and so on.

We could have arranged the eigenvalues of the quadratic form in increas-
ing, instead of decreasing, order, so that the first would be the least, the next
the second lowest, and so on. This would lead to a precisely analogous problem
to the above, except that a reference to least value would have to be substituted
for every reference to greatest value.

All the above arguments may likewise be generalized to the case of simul-
taneous reduction of two quadratic forms to the sums of squares. Let the two
quadratic forms

$$\varphi = \sum_{i,k=1}^{n} a_{ik} x_i x_k; \quad \psi = \sum_{i,k=1}^{n} b_{ik} x_i x_k$$

reduce to the sums of squares:

$$\varphi = \sum_{k=1}^{n} x_k'^2; \quad \psi = \sum_{k=1}^{n} \lambda_k x_k'^2$$

with the aid of the linear transformation

$$(x_1, \ldots, x_n) = B(x'_1, \ldots, x'_n),$$

the $\lambda_k$ above being assumed to occur in decreasing order.
With this, $\lambda_1$ is the greatest value of $\psi$ on condition that

$$\varphi = 1,$$

this greatest value being in fact obtained for

$$x_1 = b_{11}; \quad x_2 = b_{21}; \quad \ldots; \quad x_n = b_{n1}.$$

The succeeding eigenvalues may be similarly determined.


40. Hermitian matrices and Hermitian forms. We have considered
real symmetric matrices in the above sections and have noted that
they represent a particular case of Hermitian matrices in which the
elements are complex numbers satisfying

$$a_{ki} = \bar a_{ik}. \qquad (181)$$

Setting $i = k$, this relationship shows that the diagonal elements
$a_{kk}$ must be real.

An alternative definition of Hermitian matrix is as follows: an
Hermitian matrix is unchanged if its rows and columns are inter-
changed and all its elements are replaced by their conjugates, i.e. in
the notation of [26]:

$$\bar A^{(*)} = A \quad \text{or} \quad \tilde A = A. \qquad (182)$$

As already mentioned, $\tilde A$ is called the Hermitian conjugate of $A$.
Hermitian matrices are therefore alternatively described as self-
conjugate.

We proved above [32] that, for any vectors x and y, an Hermitian 
matrix A satisfies 


(Ax, y) = (x, Ay). (183) 


This relationship can serve like the two previous ones as a definition 
of Hermitian matrix. 

The following further property should be noticed.

Let $A$ be an Hermitian matrix and $U$ any unitary matrix. Then
we can easily show that $U^{-1} A U$ is likewise Hermitian. We have
$\tilde A = A$ by hypothesis. We want to show that $U^{-1} A U$ has the
same property. From [26]:

$$\widetilde{(U^{-1} A U)} = \tilde U \tilde A\, \widetilde{(U^{-1})},$$

and this gives us, in view of the hypothesis for $A$ and the unitary
nature of $U$ which implies $\tilde U = U^{-1}$:

$$\widetilde{(U^{-1} A U)} = U^{-1} A U,$$

which is what we required to prove.
We can say that, for any unitary transformation of coordinates
which is embodied for vector components in the expression

$$(x_1, \ldots, x_n) = U(x'_1, \ldots, x'_n),$$

an Hermitian $A$ as operator of a linear transformation of space
appears in the new coordinates as $U^{-1} A U$, so that the above pro-
position can also be stated as: unitary transformations of space do
not change the Hermitian nature of a matrix as operator.
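The proposition can be illustrated on a second order example. In the Python sketch below the Hermitian matrix `a`, the angles `theta` and `alpha`, and the helper names are illustrative assumptions; the Hermitian conjugate is formed by interchanging rows with columns and taking complex conjugates.

```python
import cmath
import math


def dagger(m):
    """Hermitian conjugate: interchange rows and columns, conjugate every element."""
    n = len(m)
    return [[m[k][i].conjugate() for k in range(n)] for i in range(n)]


def matmul(p, q):
    return [[sum(p[i][s] * q[s][k] for s in range(len(q))) for k in range(len(q[0]))]
            for i in range(len(p))]


def close(p, q, tol=1e-12):
    """True if two matrices of the same order agree element by element within tol."""
    return all(abs(p[i][k] - q[i][k]) < tol
               for i in range(len(p)) for k in range(len(p[0])))


a = [[2, 1 - 1j], [1 + 1j, 3]]            # Hermitian: a_ki is the conjugate of a_ik

theta, alpha = 0.7, 0.3                   # arbitrary parameters of a unitary matrix
c, s = math.cos(theta), math.sin(theta)
u = [[c, -s * cmath.exp(1j * alpha)],
     [s * cmath.exp(-1j * alpha), c]]

identity = [[1, 0], [0, 1]]
u_is_unitary = close(matmul(dagger(u), u), identity)
transformed = matmul(matmul(dagger(u), a), u)   # U^{-1} A U, since U^{-1} is dagger(u)
still_hermitian = close(dagger(transformed), transformed)
```

Both flags come out true, in agreement with the proposition.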


We now consider the problem of reducing an Hermitian matrix to
the diagonal form with the aid of a unitary transformation:

$$U^{-1} A U = [\lambda_1, \lambda_2, \ldots, \lambda_n]. \qquad (184)$$

The problem is equivalent, as above in the case of real symmetric
matrices, to the solution of an equation of the form

$$A\mathbf{x} = \lambda \mathbf{x}, \qquad (185)$$

where $\lambda$ is one of the $\lambda_k$ and the components of the vector $\mathbf{x}$ give the
elements of the corresponding column of $U$.
The numbers $\lambda_k$ and corresponding vectors $\mathbf{x}^{(k)}$ are known as the
eigenvalues and eigenvectors of matrix $A$.
As we know, the eigenvalues must be the roots of the equation

$$\begin{vmatrix}
a_{11} - \lambda & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} - \lambda & \ldots & a_{2n} \\
\ldots & \ldots & \ldots & \ldots \\
a_{n1} & a_{n2} & \ldots & a_{nn} - \lambda
\end{vmatrix} = 0. \qquad (186)$$


Let $\lambda = \lambda_1$ be a root of this equation, and $\mathbf{x}^{(1)}$ be a solution of
equation (185) with $\lambda = \lambda_1$.

Since (185) is linear and homogeneous, a solution can be multiplied
by an arbitrary constant, and we can therefore take the length of $\mathbf{x}^{(1)}$
as unity. We take this vector as the first of the fundamental set in
the new coordinates, then suitably complete the fundamental set
with a further $(n - 1)$ orthogonal unit vectors. Let $U_1$ be the unitary
transformation corresponding to passage to the new fundamental set.
Our Hermitian $A$ becomes the new Hermitian matrix $A_1 = U_1^{-1} A U_1$
in the new coordinates, whilst the corresponding equation

$$A_1 \mathbf{x} = \lambda \mathbf{x}$$

will have the vector with components $(1, 0, \ldots, 0)$ as a solution
with $\lambda = \lambda_1$. This fact shows us, as in [33], that all the elements of
the first row and column of $A_1$ must vanish except the element $\lambda_1$
at their intersection.

It follows at once from the fact that $A_1$ is Hermitian that $\lambda_1$ must
be real, and hence that every root of (186) must be real, as we saw
above. The matrix $A_1$ can now be written as

$$\begin{Vmatrix}
\lambda_1 & 0 & \ldots & 0 \\
0 & a_{22}^{(1)} & \ldots & a_{2n}^{(1)} \\
\ldots & \ldots & \ldots & \ldots \\
0 & a_{n2}^{(1)} & \ldots & a_{nn}^{(1)}
\end{Vmatrix},$$

i.e. it is a quasi-diagonal matrix of the form

$$[\lambda_1, C_1],$$

where $C_1$ denotes the Hermitian matrix of order $(n - 1)$ with elements
$a_{ik}^{(1)}$. We can now repeat the above argument and reduce $C_1$ with
the aid of a unitary transformation $U_2$ on all but the first of the
fundamental set to a form such that all the elements of the first
row and column vanish except for the element at their intersection.

We can consider this latter unitary transformation as acting on
the entire $n$-dimensional space and as given by the quasi-diagonal
unitary matrix

$$[1, U_2].$$

As a result of this transformation our Hermitian matrix becomes

$$[1, U_2]^{-1}\, [\lambda_1, C_1]\, [1, U_2] = [\lambda_1, U_2^{-1} C_1 U_2],$$

and the new Hermitian matrix will have the expanded form

$$\begin{Vmatrix}
\lambda_1 & 0 & 0 & \ldots & 0 \\
0 & \lambda_2 & 0 & \ldots & 0 \\
0 & 0 & a_{33}^{(2)} & \ldots & a_{3n}^{(2)} \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
0 & 0 & a_{n3}^{(2)} & \ldots & a_{nn}^{(2)}
\end{Vmatrix}.$$

By continuing in this way, we successively reduce our Hermitian
matrix to the diagonal form, the total unitary transformation $U$
appearing in (184) being the product of all the successive unitary
transformations.
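For a matrix of the second order the whole reduction can be written out in closed form. The Python sketch below is only an illustration under stated assumptions: the matrix `a` is chosen arbitrarily, the function names are hypothetical, and the formula used for the eigenvectors requires the off-diagonal element to be different from zero.

```python
import math


def dagger(m):
    """Hermitian conjugate of a square matrix."""
    n = len(m)
    return [[m[k][i].conjugate() for k in range(n)] for i in range(n)]


def matmul(p, q):
    return [[sum(p[i][s] * q[s][k] for s in range(len(q))) for k in range(len(q[0]))]
            for i in range(len(p))]


def diagonalize_hermitian_2x2(a):
    """Eigenvalues and a unitary matrix of eigenvectors; assumes a[0][1] != 0."""
    a11, a12, a22 = a[0][0].real, a[0][1], a[1][1].real
    mean = (a11 + a22) / 2
    radius = math.sqrt(((a11 - a22) / 2) ** 2 + abs(a12) ** 2)
    eigenvalues = (mean - radius, mean + radius)   # real roots of the characteristic equation
    columns = []
    for lam in eigenvalues:
        x = (a12, lam - a11)                       # solves (a11 - lam) x1 + a12 x2 = 0
        length = math.sqrt(abs(x[0]) ** 2 + abs(x[1]) ** 2)
        columns.append((x[0] / length, x[1] / length))
    u = [[columns[0][0], columns[1][0]],
         [columns[0][1], columns[1][1]]]           # eigenvectors as columns
    return eigenvalues, u


a = [[2, 1 - 1j], [1 + 1j, 3]]
eigenvalues, u = diagonalize_hermitian_2x2(a)
d = matmul(matmul(dagger(u), a), u)                # U^{-1} A U
```

Here the characteristic equation is $\lambda^2 - 5\lambda + 4 = 0$, so the eigenvalues are 1 and 4, and `d` is diagonal with these numbers on the diagonal up to rounding.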

We return to equation (185). We proved in [33] that its solutions
corresponding to the different values of $\lambda$ must be mutually orthogonal.

We can show exactly as in [33] that the vectors formed by the
columns of matrix $U$, together with the corresponding values of $\lambda$,
yield all the solutions of (185). We only need to bear in mind here
the following important fact regarding multiple roots of (186). If $\lambda = \lambda_1$
is say an $m$-tuple root of (186), (185) will have $m$ linearly independent
solutions $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(m)}$ for $\lambda = \lambda_1$. Every linear combination of these
with arbitrary coefficients will obviously also be a solution of (185),
i.e. the equation

$$A\mathbf{x} = \lambda_1 \mathbf{x}$$

has a set of solutions representing the subspace formed by the
vectors $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(m)}$ or in other words, defined by the sum

$$\mathbf{x} = C_1 \mathbf{x}^{(1)} + \ldots + C_m \mathbf{x}^{(m)}$$

with arbitrary coefficients $C_1, \ldots, C_m$. We can select in any manner
a system of $m$ mutually orthogonal unit vectors in this subspace,
such that their components give the columns of the matrix $U$ that
correspond to the eigenvalue $\lambda = \lambda_1$. This means that we have here
the same arbitrariness in the choice of $U$ as we had in [33] for $B$.
Moreover, we can obviously multiply the components of every $\mathbf{x}$,
obtained by solving (185), by a numerical factor of unit modulus,
i.e. by a factor of the form $e^{i\varphi}$ (the phase factor). The vector still
retains its unit length after the multiplication, as well as its ortho-
gonality to all the other vectors appearing in the complete system
of solutions of (185). Finally, we can arbitrarily change the order
of the columns in $U$. This is a trivial transformation that clearly
amounts to re-numbering the fundamental set in the new coordinate
system and merely involves a rearrangement of the $\lambda_k$ in the diagonal
matrix. We shall always assume in future that the $\lambda_k$ are in increasing
order.

We now turn to Hermitian forms. We shall say that the Hermitian
form

$$A(\mathbf{x}) = (A\mathbf{x}, \mathbf{x}) = \sum_{i,k=1}^{n} a_{ik}\, \bar x_i x_k, \qquad (187)$$

where $x_1, \ldots, x_n$ are the components of a vector $\mathbf{x}$, corresponds to
the Hermitian matrix $A$. We have previously looked on matrix $A$
as a linear transformation of space which yields a new vector $\mathbf{x}'$ on
being applied to a given vector $\mathbf{x}$, and we have written the result of
this transformation as $A\mathbf{x}$. In the expression $A(\mathbf{x})$, the final result
is no longer a vector, but a number. We saw above that this number
is real.

Now suppose we have carried out a unitary transformation of the
space, the old vector components being given in terms of the new
by $\mathbf{x} = U\mathbf{x}'$. The Hermitian form (187) becomes in the new coordinates:

$$(AU\mathbf{x}', U\mathbf{x}').$$

Property $(125_1)$ of unitary transformations enables us to multiply
both vectors in this scalar product on the left by the unitary matrix
$U^{-1}$, so that we can now write for Hermitian form (187):

$$(U^{-1} A U \mathbf{x}', \mathbf{x}'). \qquad (188)$$

In particular, if the unitary $U$ transforms $A$ to the diagonal form,
i.e. (184) is valid, only the terms containing the products $\bar x'_k x'_k$ will
remain in our Hermitian form in the new variables, and our form
will have been reduced to a sum of squares:

$$(U^{-1} A U \mathbf{x}', \mathbf{x}') = \lambda_1 \bar x'_1 x'_1 + \lambda_2 \bar x'_2 x'_2 + \ldots + \lambda_n \bar x'_n x'_n.$$




Thus the task of transforming a matrix A to the diagonal form is 
equivalent here, as in [32], to the task of reducing the corresponding 
Hermitian form to a sum of squares. 

Instead of Hermitian forms, bilinear forms are sometimes considered,
these being defined by

$$(A\mathbf{y}, \mathbf{x}) = \sum_{i,k=1}^{n} a_{ik}\, \bar x_i y_k.$$

If we again apply a unitary transformation to the space so that
the old components are given in terms of the new by the previous
formula, we have in the new coordinates:

$$(A\mathbf{y}, \mathbf{x}) = (AU\mathbf{y}', U\mathbf{x}')$$

or, by the property of unitary transformations:

$$(U^{-1} A U \mathbf{y}', \mathbf{x}').$$

Finally, if $U$ reduces $A$ to the diagonal form, the bilinear form
reduces in the corresponding coordinates to the following simple form:

$$\lambda_1 \bar x'_1 y'_1 + \lambda_2 \bar x'_2 y'_2 + \ldots + \lambda_n \bar x'_n y'_n.$$

We notice that any diagonal matrix with real elements is Hermitian,
so that $U^{-1} [\lambda_1, \ldots, \lambda_n]\, U$, where $U$ is any unitary matrix, is also
Hermitian. We saw above that, conversely, any Hermitian matrix
can be written in this form.

Hermitian forms may be classified in the same way as real quadratic
forms [35], according to the signs of the characteristic numbers $\lambda_k$.
If all the $\lambda_k$ are positive say, the Hermitian form is said to be positive
definite. It is characterized in this case by being positive for any
values of $x_k$ and by vanishing only when $x_1 = \ldots = x_n = 0$. We can
similarly define semi-definite and alternating Hermitian forms.
The discussion follows exactly the same lines as for real quadratic
forms and is based on the expression

$$(A\mathbf{x}, \mathbf{x}) = \lambda_1 \bar x'_1 x'_1 + \ldots + \lambda_n \bar x'_n x'_n.$$


Equation (183) holds for Hermitian matrices. Given any matrix $A$
and its conjugate $\tilde A = \bar A^{(*)}$, we have instead of (183):

$$(A\mathbf{x}, \mathbf{y}) = (\mathbf{x}, \tilde A \mathbf{y}). \qquad (183_1)$$

If $A$ has elements $a_{ik}$, $\tilde A$ has elements $\{\tilde A\}_{ik} = \bar a_{ki}$, and $(183_1)$ may
be verified by direct substitution as in the case of (183).




41. Commutative Hermitian matrices. Let $A$ and $B$ be two Hermitian
matrices. We consider the conditions under which their product $BA$
is likewise Hermitian. We write down the Hermitian conjugate
of $BA$:

$$\widetilde{(BA)} = \tilde A \tilde B$$

or, since $A$ and $B$ are Hermitian:

$$\widetilde{(BA)} = AB.$$

The necessary and sufficient condition for $BA$ to be Hermitian
is for $AB$ to coincide with $BA$, i.e. for the matrices to commute. Suppose
that the Hermitian matrices $A$ and $B$ are reducible to the diagonal
form by means of the same unitary transformation $U$:

$$A = U^{-1} [\lambda_1, \ldots, \lambda_n]\, U; \quad B = U^{-1} [\mu_1, \ldots, \mu_n]\, U.$$

It can easily be seen that they commute in this case:

$$AB = BA = U^{-1} [\lambda_1 \mu_1, \ldots, \lambda_n \mu_n]\, U.$$
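The condition for the product to be Hermitian can be seen on two small pairs of matrices. In the Python sketch below all four matrices and the helper names are illustrative assumptions; the elements are integers or integer complex numbers, so the comparisons are exact.

```python
def dagger(m):
    """Hermitian conjugate of a square matrix."""
    n = len(m)
    return [[m[k][i].conjugate() for k in range(n)] for i in range(n)]


def matmul(p, q):
    return [[sum(p[i][s] * q[s][k] for s in range(len(q))) for k in range(len(q[0]))]
            for i in range(len(p))]


def is_hermitian(m):
    return dagger(m) == m


def commute(p, q):
    return matmul(p, q) == matmul(q, p)


# A commuting pair of Hermitian matrices.
a = [[2, 1j], [-1j, 2]]
b = [[1, 2j], [-2j, 1]]

# A pair of Hermitian matrices that do not commute.
p = [[0, 1], [1, 0]]
q = [[1, 0], [0, -1]]

commuting_case = (commute(a, b), is_hermitian(matmul(b, a)))
non_commuting_case = (commute(p, q), is_hermitian(matmul(q, p)))
```

The first pair commutes and its product is Hermitian; the second pair does not commute and its product is not Hermitian.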


We now prove the converse: if two Hermitian matrices commute,
they can be simultaneously reduced to the diagonal form with the aid
of the same unitary transformation, i.e. commutation of Hermitian
matrices is not only a necessary, but also a sufficient condition for
them to be reducible simultaneously with the aid of a unitary trans-
formation to the diagonal form. Suppose, then, that $AB = BA$.
We notice that similar matrices to these will also commute. For

$$(C^{-1} A C)(C^{-1} B C) = C^{-1} A B C = C^{-1} B A C,$$

and the same expression is found for the product

$$(C^{-1} B C)(C^{-1} A C).$$

Suppose we choose for $C$ a unitary transformation that reduces $A$
to the diagonal form, and that we apply the same transformation to $B$.
Since the new matrices commute, we can assume in the proof of
our proposition that $A$ in fact already has the diagonal form, i.e.
its elements $a_{ik}$ satisfy the condition

$$a_{ik} = 0 \quad \text{for} \quad i \ne k. \qquad (189)$$
Let us denote the elements of $B$ by $b_{ik}$ and write down the condition
that the matrices commute:

$$\sum_{s=1}^{n} a_{is} b_{sk} = \sum_{s=1}^{n} b_{is} a_{sk} \quad (i, k = 1, 2, \ldots, n)$$

for any $i$ and $k$. This becomes, by (189):

$$(a_{ii} - a_{kk})\, b_{ik} = 0 \quad (i, k = 1, 2, \ldots, n). \qquad (190)$$


If all the $a_{ii}$ are different, these last equations at once imply that
$b_{ik} = 0$ for $i \ne k$, i.e. $B$ is also a diagonal matrix, and the proposition
is proved.

We now turn to the general case, when some of the $a_{ii}$ are identical.
We may suppose for definiteness that they fall into two groups of
equal numbers:

$$a_{11} = \ldots = a_{mm}; \quad a_{m+1,m+1} = \ldots = a_{nn}.$$

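It follows at once from (190) that in this case the $b_{ik}$ can differ
from zero only when either both subscripts $i$ and $k$ are greater than
$m$, or when both are not greater than $m$. This means that $B$ must here
be quasi-diagonal:

$$B = [B_1, B_2],$$

where $B_1$ and $B_2$ are Hermitian matrices of orders $m$ and $(n - m)$
respectively. We can write $B$ out in full as

$$\begin{Vmatrix}
b_{11} & \ldots & b_{1m} & 0 & \ldots & 0 \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
b_{m1} & \ldots & b_{mm} & 0 & \ldots & 0 \\
0 & \ldots & 0 & b_{m+1,m+1} & \ldots & b_{m+1,n} \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
0 & \ldots & 0 & b_{n,m+1} & \ldots & b_{nn}
\end{Vmatrix}.$$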


We can subject the subspace formed by the first $m$ fundamental
vectors to a unitary transformation without changing the diagonal
form $A$, and the same is true for the subspace formed by the succeed-
ing $(n - m)$ vectors. We choose these unitary transformations $V_1$
and $V_2$ so that $B_1$ and $B_2$ become diagonal. Altogether we have a
unitary transformation of the $n$-dimensional space with the quasi-
diagonal form

$$[V_1, V_2].$$

The matrix $A$ remains diagonal in the new coordinates by what
has been said above, while matrix $B$ takes the form

$$[V_1, V_2]^{-1}\, [B_1, B_2]\, [V_1, V_2] = [V_1^{-1} B_1 V_1,\; V_2^{-1} B_2 V_2],$$

i.e. is also diagonal, so that our proposition is proved.




If we now write the equations

$$A\mathbf{x} = \lambda \mathbf{x}; \quad B\mathbf{x} = \mu \mathbf{x}, \qquad (191)$$

for our commuting matrices, it follows immediately from the above
that we can form the same system of $n$ linearly independent solutions
for both the equations. These solutions in fact give the columns of
the unitary matrix $U$ that reduces both matrices to the diagonal
form. In other words, we can form the same complete system of $n$
linearly independent eigenvectors for two commuting Hermitian
matrices. The eigenvalues, i.e. the values of the parameters $\lambda$ and $\mu$,
are of course generally different. We remark that it does not follow
from the above that every eigenvector of $A$ is likewise an eigenvector
of $B$. This would be the case, of course, if all the eigenvalues of $A$
and $B$ were distinct, so that a single vector, apart from a numerical
factor, corresponded to each value $\lambda_k$ and $\mu_k$. But this is not generally
true if some of the eigenvalues are equal. Let $\mathbf{x}^{(k)}$ be the total system
of eigenvectors of matrices $A$ and $B$, whilst $\lambda_k$ and $\mu_k$ are the corre-
sponding eigenvalues. Suppose, say, that $\lambda_1 = \lambda_2$ but $\mu_1 \ne \mu_2$. The
vectors $C_1 \mathbf{x}^{(1)} + C_2 \mathbf{x}^{(2)}$ are now, for any choice of non-zero constants $C_1$ and
$C_2$, eigenvectors of $A$ but not of $B$.
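The last remark is easily illustrated on matrices that are already diagonal. In the Python sketch below the two matrices, which commute, and the test vectors are illustrative assumptions; a vector is recognized as an eigenvector when its image is proportional to it.

```python
def apply(m, x):
    """The vector m x."""
    return [sum(m[i][k] * x[k] for k in range(len(x))) for i in range(len(m))]


def is_eigenvector(m, x):
    """True if x is non-zero and m x is proportional to x (all 2x2 minors vanish)."""
    y = apply(m, x)
    n = len(x)
    proportional = all(y[i] * x[j] == y[j] * x[i]
                       for i in range(n) for j in range(i + 1, n))
    return any(v != 0 for v in x) and proportional


a = [[1, 0, 0], [0, 1, 0], [0, 0, 5]]   # lambda_1 = lambda_2 = 1, lambda_3 = 5
b = [[2, 0, 0], [0, 3, 0], [0, 0, 7]]   # mu_1 = 2, mu_2 = 3, mu_3 = 7

x1, x2 = [1, 0, 0], [0, 1, 0]           # common eigenvectors of a and b
mixed = [1, 1, 0]                        # C_1 x1 + C_2 x2 with C_1 = C_2 = 1
```

The vector `mixed` is an eigenvector of `a`, since both of its components belong to the same eigenvalue, but it is not an eigenvector of `b`.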

The whole of the above discussion is easily carried over to the
case of any number of matrices: given Hermitian matrices $A_1, \ldots, A_l$,
the necessary and sufficient condition for them to be reducible
simultaneously to the diagonal form with the aid of a unitary trans-
formation is for them to commute in pairs, i.e. $A_i A_k = A_k A_i$ for
any $i$ and $k$ from 1 to $l$.


42. The reduction of unitary matrices to the diagonal form. Unitary
matrices have a similar property to Hermitian matrices as regards
reduction to the diagonal form: if $V$ is any unitary matrix, a unitary
matrix $U$ can always be found such that

$$U^{-1} V U$$

is diagonal. We can write the problem in the form

$$VU = U\, [\lambda_1, \lambda_2, \ldots, \lambda_n], \qquad (192)$$

where $U$ is a required unitary matrix and the $\lambda_k$ are required numbers.
As in the earlier case of Hermitian matrices, the vectors $\mathbf{x}^{(k)}$
corresponding to the columns of $U$ must be solutions of the equation

$$V\mathbf{x} = \lambda \mathbf{x}, \qquad (193)$$


where $\lambda$ is any of the $\lambda_k$. It follows at once from this, as above, that
the $\lambda$ must be the roots of the characteristic equation

$$\begin{vmatrix}
v_{11} - \lambda & v_{12} & \ldots & v_{1n} \\
v_{21} & v_{22} - \lambda & \ldots & v_{2n} \\
\ldots & \ldots & \ldots & \ldots \\
v_{n1} & v_{n2} & \ldots & v_{nn} - \lambda
\end{vmatrix} = 0, \qquad (194)$$

where the elements of $V$ are written $v_{ik}$.

We notice first of all that if $V_1$ and $U_1$ are unitary, $U_1^{-1} V_1 U_1$ is
likewise unitary. For since $U_1$ is unitary, $U_1^{-1}$ is unitary, and the
product of unitary matrices is also unitary.

After substituting a root $\lambda = \lambda_1$ of equation (194) in (193) and
finding the unit vector $\mathbf{x}^{(1)}$ satisfying (193), we take this as a new
fundamental vector and associate with it a further $(n - 1)$ unit
vectors such that we have a system of $n$ mutually orthogonal unit
vectors. Passage from the old to the new fundamental set is equivalent
to a unitary transformation $U_1$, and our unitary matrix $V$ becomes
the similar matrix

$$V_1 = U_1^{-1} V U_1.$$

The equation

$$V_1 \mathbf{x} = \lambda \mathbf{x}$$

has the vector with components $(1, 0, \ldots, 0)$ as a solution for $\lambda = \lambda_1$,
whence, as above, it follows immediately that the elements of the
first column of $V_1$ are all zero except the first, which is equal to $\lambda_1$.
But since, in a unitary matrix, the sum of the squares of the moduli
of the elements of each column is unity, the number $\lambda_1$ can be said
to have a modulus of unity. We now recall that, in the unitary matrix
$V_1$, the sum of the squares of the moduli of the elements of each
row must likewise be unity. But we have just shown that $\lambda_1$, the
first element of the first row, has unit modulus, so that the remaining
elements of the row must be zero. Thus our unitary transformation
has reduced our unitary matrix to the form in which all elements
except the first of the first row and column are zero:

$$V_1 = \begin{Vmatrix}
\lambda_1 & 0 & \ldots & 0 \\
0 & v_{22}^{(1)} & \ldots & v_{2n}^{(1)} \\
\ldots & \ldots & \ldots & \ldots \\
0 & v_{n2}^{(1)} & \ldots & v_{nn}^{(1)}
\end{Vmatrix}.$$


We had the same situation previously for Hermitian matrices.
The elements $v_{ik}^{(1)}$ now form a unitary matrix of order $(n - 1)$. We can
apply a further unitary transformation so as to obtain zeros in the
first row and column of this matrix, except in the case of the first
element, the modulus of which will be unity. As a final result of our
two unitary transformations, the unitary matrix becomes

$$\begin{Vmatrix}
\lambda_1 & 0 & 0 & \ldots & 0 \\
0 & \lambda_2 & 0 & \ldots & 0 \\
0 & 0 & v_{33}^{(2)} & \ldots & v_{3n}^{(2)} \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
0 & 0 & v_{n3}^{(2)} & \ldots & v_{nn}^{(2)}
\end{Vmatrix}.$$

By continuing in this manner, our unitary matrix is reduced to
the diagonal form with the aid of a certain unitary transformation.
We remark that it follows at once from the above discussion that all
the characteristic roots of a unitary matrix have unit modulus.
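The remark about the moduli of the characteristic roots may be checked on matrices of the second order, for which the characteristic equation is a quadratic. In the Python sketch below the two unitary matrices, their parameters and the function name are illustrative assumptions.

```python
import cmath
import math


def characteristic_roots_2x2(v):
    """Roots of lam**2 - (trace) * lam + (determinant) = 0 for a 2x2 matrix v."""
    trace = v[0][0] + v[1][1]
    determinant = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    root = cmath.sqrt(trace * trace - 4 * determinant)
    return ((trace - root) / 2, (trace + root) / 2)


# A real rotation of the plane through the angle t (orthogonal, hence unitary).
t = 0.4
rotation = [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]

# A complex unitary matrix: a phase factor times a matrix with orthonormal columns.
theta, alpha, beta = 0.7, 0.3, 0.9
c, s = math.cos(theta), math.sin(theta)
phase = cmath.exp(1j * beta)
unitary = [[phase * c, -phase * s * cmath.exp(1j * alpha)],
           [phase * s * cmath.exp(-1j * alpha), phase * c]]

rotation_roots = characteristic_roots_2x2(rotation)
unitary_roots = characteristic_roots_2x2(unitary)
```

For the rotation the roots are $\cos t \pm i \sin t$, and in both cases the roots lie on the unit circle up to rounding.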

It can be shown as in [41] that if any number of unitary matrices 
commute in pairs, they can all be reduced to the diagonal form with 
the aid of the same unitary transformation. 

We also notice the following point. Let a unitary matrix $U$ reduce
a matrix $A$ to the diagonal form, i.e. $U^{-1}AU$ is diagonal. We know
that the modulus of the determinant of $U$ is unity, so that we can
find a real number $\omega$ such that the determinant of the unitary matrix
$e^{i\omega}U$ is unity. But $e^{i\omega}U$ also reduces $A$ to the diagonal form, since

$$(e^{i\omega}U)^{-1} A\,(e^{i\omega}U) = e^{-i\omega}e^{i\omega}\,U^{-1}AU = U^{-1}AU.$$

It follows that we can always take the determinant of a unitary
matrix $U$ reducing a given matrix to the diagonal form to be equal
to unity.
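
The normalization of the determinant can be checked numerically. The following is a minimal sketch for a matrix of order 2, relying on the fact that $\det(e^{i\omega}U) = e^{in\omega}\det U$ for a matrix of order $n$; the helper names `det2` and `normalize_det` are illustrative assumptions, not notation from the text.

```python
import cmath

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def normalize_det(u):
    """Multiply a 2x2 unitary matrix by e^{i*omega} so that its determinant is 1."""
    n = 2
    # det(e^{i omega} U) = e^{i n omega} det U, so choose omega = -arg(det U) / n
    omega = -cmath.phase(det2(u)) / n
    factor = cmath.exp(1j * omega)
    return [[factor * entry for entry in row] for row in u]

u = [[1, 0], [0, 1j]]            # unitary matrix with determinant i
u_normalized = normalize_det(u)
print(det2(u_normalized))        # close to 1, up to rounding
```

Since the scalar factor cancels in $U^{-1}AU$, the rescaled matrix diagonalizes the same matrices as the original one.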


Example. We take as an example the reduction to the diagonal form of a
real third order orthogonal matrix:

$$V = \begin{Vmatrix} v_{11} & v_{12} & v_{13} \\ v_{21} & v_{22} & v_{23} \\ v_{31} & v_{32} & v_{33} \end{Vmatrix}. \tag{195}$$


We shall assume that the determinant of this matrix is equal to (+1), so that
the matrix corresponds to a movement about the origin of the three-dimensional
space as a whole. The characteristic equation for matrix (195) has a constant term
equal to unity by hypothesis, since the constant term evidently coincides with
the determinant of the matrix. We have seen, on the other hand, that all the
roots of the characteristic equation have unit modulus. The first term of the
characteristic equation will be $(-\lambda)^3 = -\lambda^3$, and therefore the constant term, viz.
unity, is identical to the product of the roots. Since the equation has real
coefficients, only two cases are possible: either one root is equal to 1, whilst the
other two are imaginary conjugates of unit modulus, i.e. of the form $e^{\pm i\varphi}$,
or else one root is 1, and the two others are (-1). The second case is a particular
case of the first with $\varphi = \pi$.

The real vector $x^{(1)}$, corresponding to the eigenvalue $\lambda = 1$, must be a
solution of the equation

$$V x^{(1)} = x^{(1)}. \tag{196}$$

In other words, this vector must not change with the rotation of space defined
by matrix $V$. The vector is real since it corresponds to the real value $\lambda = 1$,
and it evidently defines the axis about which the space rotates (any rotation
of space about the origin is equivalent to rotation about some axis through the
origin). To find the components of $x^{(1)}$ in terms of the elements of matrix $V$,
we re-write (196) as

$$V^{-1} x^{(1)} = x^{(1)}$$

or, since $V$ is real and unitary, we can write

$$V^{(*)} x^{(1)} = x^{(1)}.$$

We have on subtracting this from (196):

$$(V - V^{(*)})\, x^{(1)} = 0.$$

We write this last equation out in full, the components of $x^{(1)}$ being denoted
by $(u_{11}, u_{21}, u_{31})$. This gives the system

$$\begin{aligned} (v_{12} - v_{21})\,u_{21} + (v_{13} - v_{31})\,u_{31} &= 0 \\ (v_{21} - v_{12})\,u_{11} + (v_{23} - v_{32})\,u_{31} &= 0 \\ (v_{31} - v_{13})\,u_{11} + (v_{32} - v_{23})\,u_{21} &= 0, \end{aligned}$$

whence the formula for the direction of the axis of rotation follows at once:

$$u_{11} : u_{21} : u_{31} = (v_{23} - v_{32}) : (v_{31} - v_{13}) : (v_{12} - v_{21}).$$

The two other eigenvectors $x^{(2)}$ and $x^{(3)}$ must clearly satisfy the equations

$$V x^{(2)} = e^{i\varphi} x^{(2)} \quad\text{and}\quad V x^{(3)} = e^{-i\varphi} x^{(3)}, \tag{197}$$

and these vectors now have complex components. We can find $\varphi$ from the
condition that the sum of the roots of the characteristic equation is evidently
equal to the sum of the diagonal terms, i.e. to the trace of $V$:

$$1 + e^{i\varphi} + e^{-i\varphi} = 1 + 2\cos\varphi = v_{11} + v_{22} + v_{33},$$

where $\varphi$ can be assumed to lie between 0 and $\pi$.
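
The two results of this example can be tried on a concrete matrix. The sketch below assumes the axis ratios $(v_{23} - v_{32}) : (v_{31} - v_{13}) : (v_{12} - v_{21})$ and the trace condition $1 + 2\cos\varphi = v_{11} + v_{22} + v_{33}$; the function name `rotation_axis_angle` and the sample matrix are illustrative assumptions. When $\varphi$ is 0 or $\pi$ the matrix is symmetric, the three differences vanish, and the ratios no longer determine the axis.

```python
import math

def rotation_axis_angle(v):
    """Return an (unnormalized) axis vector and the angle phi in [0, pi]
    for a real 3x3 orthogonal matrix v with determinant +1."""
    axis = (v[1][2] - v[2][1],   # v23 - v32
            v[2][0] - v[0][2],   # v31 - v13
            v[0][1] - v[1][0])   # v12 - v21
    trace = v[0][0] + v[1][1] + v[2][2]
    # 1 + 2 cos(phi) = trace; clamp to guard against rounding outside [-1, 1]
    cos_phi = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return axis, math.acos(cos_phi)

# Rotation by a right angle about the third coordinate axis
v = [[0.0, -1.0, 0.0],
     [1.0,  0.0, 0.0],
     [0.0,  0.0, 1.0]]
axis, phi = rotation_axis_angle(v)
print(axis, phi)
```

Only the ratios of the axis components matter, so the overall sign and length of `axis` carry no information.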

Since the values of $\lambda$ in equations (197) are imaginary conjugates, it follows
that we can assume that the components of $x^{(2)}$ and $x^{(3)}$ are imaginary conjugates.
We form the new unitary matrix

$$U_0 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & \dfrac{1}{\sqrt 2} & \dfrac{i}{\sqrt 2} \\ 0 & \dfrac{1}{\sqrt 2} & -\dfrac{i}{\sqrt 2} \end{Vmatrix}. \tag{198}$$

It may easily be verified directly that the elements of the columns of the
matrix $W = UU_0$ are equal to the components of the vectors

$$x^{(1)}; \quad \frac{x^{(2)} + x^{(3)}}{\sqrt 2}; \quad \frac{i\,(x^{(2)} - x^{(3)})}{\sqrt 2},$$

i.e. they are real. Moreover $W$ must also be unitary since it is the product of
two unitary matrices, i.e. $W$ is orthogonal. We now use the real unitary matrix
$W$ to apply a similarity transformation to $V$. This gives

$$W^{-1} V W = U_0^{-1} U^{-1} V U U_0 = U_0^{-1}\,[1, e^{i\varphi}, e^{-i\varphi}]\,U_0.$$

On carrying out the actual matrix multiplication, we get

$$W^{-1} V W = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \cos\varphi \end{Vmatrix}. \tag{199}$$


We can always suppose that the determinant of the orthogonal matrix
$W$ is (+1), since we could multiply the matrix by (-1) if this were not the
case, as a result of which (199) would remain unchanged. Hence $W$ will also
correspond to a rotation of three-dimensional space. Matrix (199), obtained
as a result of the coordinate transformation $x' = Wx$, is similar to $V$ and yields
the same transformation in the new coordinates as the original matrix $V$ gave
in the old. It follows directly from the form of matrix (199) that this corresponds
to a rotation about the new axis $x_1'$ by an angle $\varphi$, and the essence of our
transformation amounts to our having used as axis $x_1'$ the above-mentioned axis
of rotation represented by the vector $x^{(1)}$.

A further important fact follows at once from the above: all the real matrices
corresponding to a rotation of space by a given angle $\varphi$ can be reduced to the
same form (199) with the aid of a similarity transformation (different for
different matrices), so that such matrices are similar to each other.

The matrices corresponding to different angles of rotation cannot be similar,
since the different values of $\varphi$ lead to different sets of characteristic roots 1,
$e^{i\varphi}$, $e^{-i\varphi}$. All these properties have an extremely simple geometrical
interpretation.


43. Projection matrices. We shall now consider a particular case of Hermitian
matrices. Let $R_m$ be the m-dimensional subspace formed by the linearly
independent vectors $y^{(1)}, \dots, y^{(m)}$. The subspace $R_m$ consists of the set of
vectors of the form

$$C_1 y^{(1)} + \dots + C_m y^{(m)},$$

where the $C_k$ are arbitrary numerical coefficients. We can orthogonalize the
$y^{(k)}$ and form m mutually orthogonal unit vectors

$$x^{(1)}, \dots, x^{(m)},$$

which yield the same subspace $R_m$. Then we can make these into a complete
system of n mutually orthogonal unit vectors by constructing a further $(n - m)$
unit vectors

$$x^{(m+1)}, \dots, x^{(n)}.$$


These last vectors form an $(n - m)$-dimensional subspace $R'_{n-m}$, the two
subspaces $R_m$ and $R'_{n-m}$ being mutually orthogonal in the sense that any vector
of the former is orthogonal to any vector of the latter [14]. On writing any
vector $x$ in terms of the fundamental set $x^{(k)}$:

$$x = x_1 x^{(1)} + \dots + x_n x^{(n)}, \tag{200}$$

we can represent it as the sum of two vectors:

$$x = (x_1 x^{(1)} + \dots + x_m x^{(m)}) + (x_{m+1} x^{(m+1)} + \dots + x_n x^{(n)}) = u + v, \tag{201}$$


one of which belongs to $R_m$ and the other to $R'_{n-m}$. It is easily seen that this
resolution of any vector $x$ into two components is unique. For suppose that,
in addition to (201), we have a second resolution $x = u' + v'$ with the above
property. Then

$$u + v = u' + v' \quad\text{or}\quad u - u' = v' - v.$$

The vector on the left belongs to $R_m$ and that on the right to $R'_{n-m}$, so that
$u - u'$ and $v' - v$ are orthogonal. Since they are one and the same vector, this
vector is orthogonal to itself.

But any vector orthogonal to itself is clearly zero [14], and consequently
$u - u' = 0$, i.e. $u$ is the same as $u'$, whilst $v$ is the same as $v'$, i.e. $u$ and $v$
are uniquely defined for $x$. The vector $u$ is called the projection of $x$ on the
subspace $R_m$. The matrix for passing from $x$ to $u$ is called the projection matrix
on to the subspace $R_m$ and is written $P_{R_m}$. The form of this matrix naturally
depends on the choice of coordinate axes.
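
The orthogonalization of the $y^{(k)}$ and the resolution $x = u + v$ can be sketched as follows. This is a minimal illustration assuming the scalar product $(x, y) = \sum_k x_k \bar y_k$ and linearly independent input vectors; the names `inner`, `gram_schmidt` and `project` are hypothetical helpers, and the process used is the usual Gram-Schmidt procedure.

```python
def inner(x, y):
    """Scalar product (x, y) = sum of x_k * conj(y_k)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Turn linearly independent vectors into mutually orthogonal unit vectors."""
    basis = []
    for y in vectors:
        w = list(y)
        for e in basis:
            c = inner(w, e)                      # component of w along e
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = abs(inner(w, w)) ** 0.5           # non-zero for independent input
        basis.append([wi / norm for wi in w])
    return basis

def project(x, basis):
    """Split x into u (in the subspace) and v (orthogonal to it)."""
    u = [0] * len(x)
    for e in basis:
        c = inner(x, e)
        u = [ui + c * ei for ui, ei in zip(u, e)]
    v = [xi - ui for xi, ui in zip(x, u)]
    return u, v

basis = gram_schmidt([[1, 1, 0], [1, 0, 1]])   # a two-dimensional subspace
x = [1, 2, 3]
u, v = project(x, basis)
print(u, v)
```

Here `u` plays the part of the projection of `x` on the subspace and `v` is the component orthogonal to it.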

If we take the $x^{(k)}$ as fundamental set, $x$ is given by (201), whilst $u$ is given
by

$$u = x_1 x^{(1)} + \dots + x_m x^{(m)},$$

and the operation of projection here amounts simply to leaving the first m
components as before and putting the remainder equal to zero. The corresponding
projection matrix is clearly diagonal:

$$P_{R_m} = [1, 1, \dots, 1, 0, 0, \dots, 0],$$

where we have unity in the first m places and zero elsewhere. If the fundamental
set were numbered differently, we should still get a diagonal matrix of ones and
zeros, though in a different order. In the general case of any choice of Cartesian
axes, the projection matrix has the form:

$$P_{R_m} = U^{-1}\,[1, \dots, 1, 0, \dots, 0]\,U, \tag{202}$$

where $U$ is unitary, and the eigenvalues of $P_{R_m}$ are either zero or unity.
Conversely, every Hermitian matrix of this form is a projection matrix on to a
subspace whose number of dimensions is given by the number of eigenvalues
of $P_{R_m}$ equal to unity.

A projection matrix can be alternatively defined as follows: a projection
matrix is an Hermitian matrix satisfying the equation

$$P^2 = P. \tag{203}$$


For we can easily verify that a matrix of the form (202) satisfies relationship
(203) on noticing that $1^2 = 1$ and $0^2 = 0$. Conversely, if an Hermitian matrix
satisfies (203), and we write it

$$P = U^{-1}\,[\lambda_1, \dots, \lambda_n]\,U,$$

we have by (203):

$$U^{-1}\,[\lambda_1^2, \dots, \lambda_n^2]\,U = U^{-1}\,[\lambda_1, \dots, \lambda_n]\,U,$$

i.e. $\lambda_k^2 = \lambda_k$ $(k = 1, 2, \dots, n)$, whence it follows at once that $\lambda_k$ is unity or zero.
If all the characteristic roots of the matrix are unity, we have the unit matrix 
which corresponds to the identity transformation; in other words, a vector 
is projected on to the total space (and remains unchanged). Apart from this 
trivial case, we have at least one zero characteristic root in the projection 
matrix, so that the determinant of the matrix, equal to the product of the 
characteristic roots, is also zero, i.e. there is no question of an inverse matrix
$P^{-1}$. We notice that it also follows directly from the definition that the
projection matrix $P_R$ does not change a vector belonging to the subspace $R$
and diminishes the length of a vector not belonging to $R$.
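
These properties can be checked on a small case. The sketch below builds the projection matrix on to the subspace spanned by given mutually orthogonal unit vectors $e$ as the matrix with elements $p_{ik} = \sum_e e_i \bar e_k$ (a standard construction, assumed here rather than taken from the text) and checks that it is Hermitian and equal to its square; the helper names are illustrative.

```python
import math

def projection_matrix(basis):
    """Matrix with elements p_ik = sum over basis vectors e of e_i * conj(e_k)."""
    n = len(basis[0])
    return [[sum(e[i] * complex(e[k]).conjugate() for e in basis)
             for k in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def max_diff(a, b):
    """Largest modulus of the difference between corresponding elements."""
    n = len(a)
    return max(abs(a[i][k] - b[i][k]) for i in range(n) for k in range(n))

s = 1 / math.sqrt(2)
basis = [[s, s, 0.0], [0.0, 0.0, 1.0]]       # two orthogonal unit vectors
p = projection_matrix(basis)
p_dagger = [[complex(p[k][i]).conjugate() for k in range(3)] for i in range(3)]

print(max_diff(matmul(p, p), p))   # P^2 = P up to rounding
print(max_diff(p_dagger, p))       # P is Hermitian

x = [1.0, 0.0, 0.0]                # a vector not in the subspace
px = [sum(p[i][k] * x[k] for k in range(3)) for i in range(3)]
length_x = math.sqrt(sum(abs(c) ** 2 for c in x))
length_px = math.sqrt(sum(abs(c) ** 2 for c in px))
```

In this example `length_px` comes out smaller than `length_x`, in line with the remark that projection diminishes the length of a vector outside the subspace.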

We follow these preliminary observations by considering some operations
with projection matrices. Let us have two projection matrices $P_R$ and $P_S$
such that their product is zero, i.e. all the elements of the product matrix are
zero:

$$P_S P_R = 0. \tag{204}$$

We take a vector $x$ of the subspace $R$, so that $P_R x = x$. Equation (204)
gives us

$$P_S x = 0.$$

But it follows directly from this that $x$ is orthogonal to any vector of the
subspace $S$. For otherwise we could find a unit vector $y$ of $S$ not orthogonal
to $x$ and on taking this as the first of the fundamental set, we should have
a non-zero magnitude for the first component of $x$ which would remain unchanged
on projection of $x$ on to $S$. Hence we see that, if condition (204) is satisfied,
every vector of $R$ is orthogonal to every vector of $S$, and conversely. We now
have, in addition to (204):

$$P_R P_S = 0. \tag{205}$$

For, given any vector $y$, the vector $P_S y$ belongs to $S$ and is therefore
orthogonal to every vector of $R$, i.e. we have for any $y$:

$$P_R P_S y = 0,$$

which is equivalent to (205). Conversely, if two subspaces $R$ and $S$ are
orthogonal in the above sense, (204) and (205) are valid.
We now consider the sum of the projection matrices:

$$P = P_R + P_S \tag{206}$$

and assume that (204) and (205) are satisfied. We show that (206), which is
clearly Hermitian, is also a projection matrix, i.e. we want to show that it is
equal to its square:

$$P^2 = (P_R + P_S)(P_R + P_S) = P_R^2 + P_R P_S + P_S P_R + P_S^2,$$

from which we have, in fact, in view of our conditions and the fact that $P_R$
and $P_S$ are projection matrices:

$$P^2 = P_R + P_S = P.$$


It may easily be seen that $P$ corresponds in the present case to a projection
on to the subspace $(R + S)$, where the addition of $R$ and $S$ is taken to mean
the subspace consisting of all the vectors which are the sums of vectors of
$R$ and of $S$, i.e. if $R$ is formed from $x^{(1)}, \dots, x^{(k)}$ and $S$ from $y^{(1)}, \dots, y^{(l)}$,
$(R + S)$ consists of the system

$$C_1 x^{(1)} + \dots + C_k x^{(k)} + D_1 y^{(1)} + \dots + D_l y^{(l)},$$

where the $C_i$ and $D_j$ are arbitrary constants. The above property may be
generalized for any number of terms:

$$P = P_{S_1} + \dots + P_{S_m}. \tag{207}$$

If the subspaces $S_k$ are orthogonal in pairs, i.e. any vector of $S_i$ is orthogonal
to any vector of $S_j$ for differing $i$ and $j$, sum (207) represents the projection
matrix on to the subspace $(S_1 + \dots + S_m)$, formed from all the vectors used
for forming the $S_k$. In particular, the sum can be equal to the unit matrix:

$$I = P_{S_1} + \dots + P_{S_m},$$


and we usually speak in this case of the resolution of the identity into projection
matrices, or simply, of the resolution of the identity.
We next consider the product of two projection matrices:

$$P = P_S P_R. \tag{208}$$

For the product to be likewise a projection matrix, we first of all require it
to be Hermitian, which implies in turn [41] that the matrices commute:

$$P_R P_S = P_S P_R. \tag{209}$$

This condition may be shown to be sufficient, i.e. $P^2 = P$ in this case. We
have

$$P^2 = P_S P_R P_S P_R,$$

or, on commuting the matrices in accordance with (209):

$$P^2 = P_S^2 P_R^2 = P_S P_R,$$

which is what we required to prove. It may easily be verified that, given
commutation condition (209), matrix (208) corresponds to the subspace formed
by the vectors common to the two sets that form $R$ and $S$.

We also notice a simple result, the proof of which we need not dwell on:
if $S$ forms part of subspace $R$, the difference

$$P = P_R - P_S \tag{210}$$

is also a projection matrix. If we take the $x^{(k)}$ as the fundamental set forming $S$,
we have to add one or more linearly independent vectors in order to get the
fundamental set forming $R$. These added vectors themselves form a subspace
$T$, and the projection matrix on to $T$ is given by matrix (210).




By using projection matrices, we can state the problem of reducing a Hermi- 
tian matrix to the diagonal form in a precise manner even with the presence 
of multiple eigenvalues. 

Suppose, for instance, that we have the Hermitian matrix

$$A = U\,[\lambda_1, \dots, \lambda_n]\,U^{-1},$$

where $U$ is a unitary matrix. Suppose for definiteness that the $\lambda_k$ fall into two
groups, the m of the first group being all equal to $\mu$, and the remaining $(n - m)$
of the second group being equal to $\nu$:

$$A = U\,[\mu, \dots, \mu, \nu, \dots, \nu]\,U^{-1}.$$

We can evidently re-write our matrix as

$$A = \mu\,U\,[1, \dots, 1, 0, \dots, 0]\,U^{-1} + \nu\,U\,[0, \dots, 0, 1, \dots, 1]\,U^{-1}.$$

We now introduce the projection matrices

$$P_R = U\,[1, \dots, 1, 0, \dots, 0]\,U^{-1}; \quad P_S = U\,[0, \dots, 0, 1, \dots, 1]\,U^{-1}.$$

The corresponding subspaces $R$ and $S$ are obviously orthogonal, and addition
of the projection matrices yields the unit matrix. We thus have

$$A = \mu P_R + \nu P_S,$$

where

$$\lambda_1 = \dots = \lambda_m = \mu \quad\text{and}\quad \lambda_{m+1} = \dots = \lambda_n = \nu.$$

The problem of reducing an Hermitian matrix to the diagonal form amounts
in general to a resolution of the identity

$$I = P_{S_1} + \dots + P_{S_m} \tag{211}$$

such that $A$ is expressible in the form

$$A = \mu_1 P_{S_1} + \dots + \mu_m P_{S_m}, \tag{212}$$

where the $\mu_k$ are the different eigenvalues of our matrix $A$. Thus to every
Hermitian matrix there corresponds a definite resolution of the identity (211)
such that the matrix is expressible in the form (212).
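
As a small worked illustration of a resolution of the identity (the matrix is chosen here for convenience and does not come from the text), take a real symmetric matrix of order 2 with eigenvalues 3 and 1, whose eigenvectors are proportional to $(1, 1)$ and $(1, -1)$:

```latex
A = \begin{Vmatrix} 2 & 1 \\ 1 & 2 \end{Vmatrix}, \qquad
P_{S_1} = \frac{1}{2}\begin{Vmatrix} 1 & 1 \\ 1 & 1 \end{Vmatrix}, \qquad
P_{S_2} = \frac{1}{2}\begin{Vmatrix} 1 & -1 \\ -1 & 1 \end{Vmatrix},

P_{S_1} + P_{S_2} = I, \qquad
P_{S_1} P_{S_2} = P_{S_2} P_{S_1} = 0, \qquad
P_{S_1}^2 = P_{S_1}, \quad P_{S_2}^2 = P_{S_2},

3\,P_{S_1} + 1\cdot P_{S_2}
  = \begin{Vmatrix} \tfrac{3}{2} + \tfrac{1}{2} & \tfrac{3}{2} - \tfrac{1}{2} \\[2pt]
                    \tfrac{3}{2} - \tfrac{1}{2} & \tfrac{3}{2} + \tfrac{1}{2} \end{Vmatrix}
  = A .
```

Both projection matrices are Hermitian and equal to their squares, so this is an instance of a resolution of the identity of the kind described, with two terms.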

All the above results can easily be translated into the language of Hermitian
forms instead of matrices. For every projection matrix $P_R$ with elements
$p_{ik}$ we have a corresponding Hermitian form:

$$P_R(x) = (P_R x, x) = \sum_{i,k=1}^{n} p_{ik}\,\bar x_i x_k, \tag{213}$$

which is sometimes called an Einzelform. If the corresponding subspace $R$ has
m dimensions, and we take m mutually orthogonal unit vectors of $R$ as the
first m of our fundamental set, form (213) becomes in this coordinate system:

$$(P_R x', x') = \bar x_1' x_1' + \bar x_2' x_2' + \dots + \bar x_m' x_m'.$$

We observe further that, if the matrices $P_{S_k}$ are the resolution of the identity
given by (211), we clearly have, on choosing as fundamental set mutually
orthogonal unit vectors from each of the subspaces $S_k$:

$$\sum_{k=1}^{m} P_{S_k}(x') = \sum_{i=1}^{n} \bar x_i' x_i',$$

and consequently the sum

$$\sum_{k=1}^{m} P_{S_k}(x)$$

gives the square of the length of the vector for any choice of coordinate axes.
We can therefore say that the problem of reducing an Hermitian form $A$ to a sum
of squares is equivalent to solving the two equations:

$$A(x) = \sum_{k=1}^{m} \mu_k P_{S_k}(x), \tag{214}$$

$$|x|^2 = \sum_{k=1}^{m} P_{S_k}(x). \tag{215}$$


The introduction of the projection matrices thus allows of a statement of 
the problem of reducing an Hermitian matrix to the diagonal form without 
any special choice of coordinate axes. This in turn makes it possible to extend 
the above results, with suitable changes, to the case of space with an infinity 
of dimensions, which is the basic mathematical problem from the point of view 
of present-day quantum mechanics. We shall not discuss this till later. This 
extension to the case of an infinite set of dimensions carries us outside the realm 
of algebra and is intimately connected with the introduction of the apparatus 
of analysis. 


44. Functions of matrices. Matrices can take the role of the arguments
of functions. We confine ourselves here to considering the most
elementary functions, viz., matrix polynomials and rational fractions.
A more detailed treatment of the theory of functions of matrices
will be found later, after the theory of functions of a complex variable.
A polynomial $f(A)$ of degree m in the variable matrix $A$ has the form:

$$f(A) = c_0 + c_1 A + \dots + c_m A^m, \tag{216}$$

where the $c_k$ are numerical coefficients. The value of the function is
given by the matrix whose elements are evidently

$$\{f(A)\}_{ik} = c_0\,\delta_{ik} + c_1\{A\}_{ik} + \dots + c_m\{A^m\}_{ik},$$

where

$$\delta_{ik} = 0 \ \text{for}\ i \ne k \quad\text{and}\quad \delta_{ii} = 1.$$


We can also consider a polynomial in several matrices but have to 
bear in mind the non-commutativeness of matrices on multiplication. 


A second degree polynomial in two variable matrices $A$ and $B$ has
the general form

$$f(A, B) = c_0 + c_1 A + c_2 B + c_3 A^2 + c_4 B^2 + c_5 AB + c_6 BA.$$

We replace the $A$ in (216) by the similar matrix $U^{-1}AU$. We have,
on recalling that $(U^{-1}AU)^k = U^{-1}A^kU$,

$$f(U^{-1}AU) = c_0 + c_1 U^{-1}AU + \dots + c_m U^{-1}A^mU = U^{-1}(c_0 + c_1 A + \dots + c_m A^m)\,U,$$

i.e.

$$f(U^{-1}AU) = U^{-1} f(A)\, U. \tag{217}$$

An analogous expression holds for a polynomial in several matrices:

$$f(U^{-1}AU,\ U^{-1}BU) = U^{-1} f(A, B)\, U. \tag{218}$$


We next dwell in rather more detail on the case of Hermitian
matrices. If $A$ is Hermitian, it follows directly from the definition
that any positive integral power $A^k$, and the product $cA$, where $c$
is a real constant, are also Hermitian. Moreover, the sum of Hermitian
matrices is Hermitian. Hence it follows directly that if $A$ is Hermitian
in (216) and the coefficients $c_k$ are real numbers, the value of the
function $f(A)$ is also Hermitian. The Hermitian matrix $f(A)$ clearly
commutes with $A$, and they can be simultaneously reduced to the
diagonal form with the aid of some unitary transformation. We notice
firstly that if we substitute a diagonal matrix $[\lambda_1, \dots, \lambda_n]$ for $A$ in
function (216), we clearly get another diagonal matrix:

$$\sum_{k=0}^{m} c_k\,[\lambda_1, \dots, \lambda_n]^k = [f(\lambda_1), \dots, f(\lambda_n)], \tag{219}$$

where $f(\lambda_i)$ is the numerical value of the polynomial on substituting
the numbers $\lambda_i$ for $A$.
Now let $V$ be the unitary transformation reducing $A$ to the diagonal
form:

$$A = V\,[\lambda_1, \dots, \lambda_n]\,V^{-1}.$$

We have by (217) and (219):

$$f(A) = V\,[f(\lambda_1), \dots, f(\lambda_n)]\,V^{-1},$$

i.e. $V$ also reduces $f(A)$ to the diagonal form, the eigenvalues of this
last being $f(\lambda_k)$.
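
The statement that the eigenvalues of $f(A)$ are the values $f(\lambda_k)$ can be tried on a small case. The sketch below evaluates a matrix polynomial by Horner's scheme (an implementation choice made here, not the text's); the sample matrix, with eigenvalues 3 and 1, the coefficients, and the helper names are illustrative assumptions.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, a):
    """Evaluate c0 + c1*A + ... + cm*A^m, with coeffs = [c0, c1, ..., cm]."""
    n = len(a)
    # Start from cm * I and repeatedly multiply by A and add the next c * I
    result = [[coeffs[-1] if i == k else 0 for k in range(n)] for i in range(n)]
    for c in reversed(coeffs[:-1]):
        result = matmul(result, a)
        for i in range(n):
            result[i][i] += c
    return result

def poly_of_number(coeffs, t):
    """Evaluate the same polynomial at the number t."""
    value = 0
    for c in reversed(coeffs):
        value = value * t + c
    return value

a = [[2, 1], [1, 2]]          # eigenvalues 3 and 1, eigenvectors (1, 1) and (1, -1)
coeffs = [1, -3, 1]           # f(t) = 1 - 3t + t^2
f_a = poly_of_matrix(coeffs, a)

def apply(m, x):
    return [sum(m[i][k] * x[k] for k in range(len(x))) for i in range(len(m))]

# f(A) acts on each eigenvector of A as multiplication by f(eigenvalue)
image_1 = apply(f_a, [1, 1])
image_2 = apply(f_a, [1, -1])
print(f_a, image_1, image_2)
```

The eigenvectors of $A$ remain eigenvectors of $f(A)$, which is the content of the statement that the same transformation reduces both matrices to the diagonal form.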


We now turn to rational fractions. Let $f_1(A)$ and $f_2(A)$ be two
polynomials in the matrix $A$. We consider their quotient:

$$\frac{f_1(A)}{f_2(A)}. \tag{220}$$

We saw earlier [26] that the quotient of two matrices does not in
general have a definite meaning. It may be shown in the present
case, however, that (220) has a definite value provided only that the
determinant of matrix $f_2(A)$ differs from zero. We can write (220)
in two ways:

$$f_1(A)\,f_2(A)^{-1} \quad\text{or}\quad f_2(A)^{-1} f_1(A).$$

We shall show that these two expressions are equal:

$$f_1(A)\,f_2(A)^{-1} = f_2(A)^{-1} f_1(A),$$

or what amounts to the same thing:

$$f_2(A)\,f_1(A) = f_1(A)\,f_2(A). \tag{221}$$

Since our polynomials contain only the single matrix $A$, they
commute, i.e. (221) in fact holds, and quotient (220) has a single
value. It is easily shown further that, in the case of a single matrix,
rational fractions can be multiplied like ordinary fractions. For

$$\frac{f_1(A)}{f_2(A)} \cdot \frac{f_3(A)}{f_4(A)} = f_1(A)\,f_2(A)^{-1} f_3(A)\,f_4(A)^{-1},$$

or, since the terms commute:

$$\frac{f_1(A)}{f_2(A)} \cdot \frac{f_3(A)}{f_4(A)} = f_1(A)\,f_3(A)\,[f_2(A)\,f_4(A)]^{-1} = \frac{f_1(A)\,f_3(A)}{f_2(A)\,f_4(A)}.$$
We take as an example the rational fraction

$$U = \frac{1 + iA}{1 - iA}, \tag{222}$$

where $A$ is an Hermitian matrix, i.e. $\tilde A = A$. It is easy to see
that $U$ is unitary, i.e.

$$\tilde U = U^{-1}. \tag{223}$$

For we have

$$U = \frac{1 + iA}{1 - iA} = (1 + iA)(1 - iA)^{-1},$$

whence we get, on passing to the Hermitian conjugate matrix [26]:

$$\tilde U = (1 + i\tilde A)^{-1}(1 - i\tilde A),$$

or, since $\tilde A = A$:

$$\tilde U = (1 + iA)^{-1}(1 - iA) = \frac{1 - iA}{1 + iA} = U^{-1},$$

i.e. (223) is satisfied, and $U$ is in fact unitary.
We can write (222) in the form

$$U(1 - iA) = 1 + iA,$$

whilst the fact that $U$ commutes with $A$ by (222) means that

$$A = \frac{1}{i}\,\frac{U - 1}{U + 1}. \tag{224}$$

It can be shown, precisely as above, that if $U$ is unitary and the
determinant of the matrix $U + 1$ differs from zero, the matrix $A$
defined by (224) is Hermitian. Hence any unitary matrix for which
$D(U + 1) \ne 0$ can be written in terms of an Hermitian matrix $A$
in accordance with (222).
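
The correspondence between Hermitian and unitary matrices can be checked numerically for matrices of order 2. The sketch below assumes that (222) reads $U = (1 + iA)(1 - iA)^{-1}$ and that (224) reads $A = \frac{1}{i}(U - 1)(U + 1)^{-1}$; the function names `cayley` and `inverse_cayley` and the sample matrix are illustrative assumptions.

```python
def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(2)) for k in range(2)]
            for i in range(2)]

def inv2(m):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def combine(a, b, s):
    """Return the matrix a + s*b."""
    return [[a[i][k] + s * b[i][k] for k in range(2)] for i in range(2)]

def dagger(m):
    """Hermitian conjugate (conjugate transpose)."""
    return [[complex(m[k][i]).conjugate() for k in range(2)] for i in range(2)]

ONE = [[1, 0], [0, 1]]

def cayley(a):
    """U = (1 + iA)(1 - iA)^{-1}."""
    return matmul(combine(ONE, a, 1j), inv2(combine(ONE, a, -1j)))

def inverse_cayley(u):
    """A = (1/i)(U - 1)(U + 1)^{-1}, assuming det(U + 1) is non-zero."""
    quotient = matmul(combine(u, ONE, -1), inv2(combine(u, ONE, 1)))
    return [[entry / 1j for entry in row] for row in quotient]

a = [[1, 1j], [-1j, 2]]                      # Hermitian sample matrix
u = cayley(a)
unitarity_error = max(abs(x) for row in combine(matmul(u, dagger(u)), ONE, -1)
                      for x in row)
recovery_error = max(abs(x) for row in combine(inverse_cayley(u), a, -1)
                     for x in row)
print(unitarity_error, recovery_error)       # both small, up to rounding
```

The matrix $1 - iA$ is always invertible for Hermitian $A$, since the eigenvalues of $A$ are real and $1 - i\lambda$ cannot vanish.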


45. Infinite-dimensional space. We now set about introducing the
concept of space with an infinite set of dimensions. We need the
preliminary idea of the limit of a complex variable. Let the complex
variable $z = x + yi$ take the sequence of values:

$$z_1 = x_1 + y_1 i,\quad z_2 = x_2 + y_2 i,\quad \dots,\quad z_n = x_n + y_n i,\quad \dots \tag{225}$$

We say that the complex number $\alpha = a + bi$ is the limit of sequence
(225) if the modulus of the difference $(\alpha - z_n)$ tends to zero on
indefinite increase of n, i.e. $|\alpha - z_n| \to 0$ as $n \to \infty$, and we write
$\alpha = \lim z_n$ or $z_n \to \alpha$. But

$$|\alpha - z_n| = |(a - x_n) + (b - y_n)\,i| = \sqrt{(a - x_n)^2 + (b - y_n)^2}.$$

Since both terms under the radical are non-negative, the condition
$|\alpha - z_n| \to 0$ is equivalent to the two conditions: $x_n \to a$ and $y_n \to b$.
Thus

$$x_n + y_n i \to a + bi \tag{226}$$

is equivalent to $x_n \to a$ and $y_n \to b$. We consider the complex series:

$$\sum_{k=1}^{\infty} (a_k + b_k i). \tag{227}$$

It is said to be convergent if the sum of its first n terms:

$$S_n = \sum_{k=1}^{n} (a_k + b_k i) = (a_1 + \dots + a_n) + (b_1 + \dots + b_n)\,i$$

tends to a limit: $S_n \to a + bi$ on indefinite increase of n, the limit
$(a + bi)$ being called the sum of the series. It follows from this definition
that the convergence of series (227) is equivalent to the convergence
of the series

$$a = \sum_{k=1}^{\infty} a_k \quad\text{and}\quad b = \sum_{k=1}^{\infty} b_k, \tag{228}$$

formed from the real and imaginary parts of the terms of (227).
Suppose that the series

$$\sum_{k=1}^{\infty} |a_k + b_k i| = \sum_{k=1}^{\infty} \sqrt{a_k^2 + b_k^2}, \tag{229}$$

formed from the moduli of the terms of (227), is convergent. In view
of the obvious inequalities

$$|a_k| \le \sqrt{a_k^2 + b_k^2} \quad\text{and}\quad |b_k| \le \sqrt{a_k^2 + b_k^2}, \tag{230}$$

series (228) will now also be convergent and in fact converge absolutely,
and series (227) is therefore also convergent, i.e. if series (229) is
convergent, (227) is certainly convergent. Series (227) is said to be
absolutely convergent in this case. By applying the usual Cauchy test,
we can state the necessary and sufficient condition for absolute
convergence as follows: given any small positive $\varepsilon$, there exists an $N$
such that

$$\sum_{k=n}^{n+p} |a_k + b_k i| < \varepsilon, \tag{231}$$

where p is any positive integer and $n > N$.
We now apply the above to some particular cases that are essential
to what follows. We take the series

$$\sum_{k=1}^{\infty} \alpha_k \beta_k, \tag{232}$$

where $\alpha_k$ and $\beta_k$ are complex numbers, and where it is known that
the series

$$\sum_{k=1}^{\infty} |\alpha_k|^2 \quad\text{and}\quad \sum_{k=1}^{\infty} |\beta_k|^2 \tag{233}$$

are convergent. We use the inequality proved in [29]:

$$\left[\sum_{k=n}^{n+p} |\alpha_k \beta_k|\right]^2 \le \sum_{k=n}^{n+p} |\alpha_k|^2 \cdot \sum_{k=n}^{n+p} |\beta_k|^2.$$

We obtain from this, on taking into account the convergence of
series (233), the fact that the sum

$$\sum_{k=n}^{n+p} |\alpha_k \beta_k|$$

is as small as we please for large n and any p, i.e. the convergence
of series (233) guarantees the absolute convergence of series (232).
We now consider

$$\sum_{k=1}^{\infty} |\alpha_k + \beta_k|^2 = \sum_{k=1}^{\infty} (\alpha_k + \beta_k)(\bar\alpha_k + \bar\beta_k), \tag{234}$$

series (233) being assumed convergent as before. Series (234) can be
written as the sum of four series:

$$\sum_{k=1}^{\infty} |\alpha_k|^2; \quad \sum_{k=1}^{\infty} |\beta_k|^2; \quad \sum_{k=1}^{\infty} \alpha_k \bar\beta_k; \quad \sum_{k=1}^{\infty} \bar\alpha_k \beta_k.$$

The first two are convergent by hypothesis, whilst the last two are
convergent in view of the proposition proved above, i.e. the
convergence of series (233) implies the convergence of (234).

We now turn to space with an infinite number of dimensions.
A vector in this space is defined by an infinite sequence of complex
numbers:

$$x(x_1, x_2, \dots),$$

these numbers being always assumed subject to the condition that
the series

$$\sum_{k=1}^{\infty} |x_k|^2 \tag{235}$$

is convergent. The aggregate of such vectors is generally called Hilbert
space, the first investigation of this space being due to Hilbert.
In future we shall write $H$ for the space for brevity.

As above, we bring in the basic operations of multiplication of a
vector by a number and addition of vectors for vectors of the space $H$.
If we write the components of $x$ as $x_k$, we take the components of $cx$,
where $c$ is a complex number, as equal to $cx_k$. If $x_k$ and $y_k$ are
the components of $x$ and $y$, the components of the vector $(x + y)$
are taken to be equal to $(x_k + y_k)$. The difference $x - y$ is the sum
of $x$ and $(-1)y$ (cf. [12]). Since series (235) is convergent, the series
$\sum_{k=1}^{\infty} |cx_k|^2$ is also convergent. Similarly, if

$$\sum_{k=1}^{\infty} |x_k|^2 \quad\text{and}\quad \sum_{k=1}^{\infty} |y_k|^2$$

are convergent, it follows from what has been said above that

$$\sum_{k=1}^{\infty} |x_k + y_k|^2$$

is also convergent, i.e. the numerical sequences $(cx_1, cx_2, \dots)$ and
$(x_1 + y_1, x_2 + y_2, \dots)$ define the vectors $cx$ and $x + y$ in $H$, if $x$ and $y$
belong to $H$. The null vector is the vector, all the components of
which are zero. It is simply denoted by the number 0 in vector
equations.

Operations on the vectors are subject to the usual rules (cf. [12]):

$$x + y = y + x; \quad (x + y) + z = x + (y + z);$$
$$(a + b)x = ax + bx; \quad a(x + y) = ax + ay; \quad a(bx) = (ab)x.$$

Similarly, from what has been said, we can define the scalar product
of two vectors $x$ and $y$ of the space:

$$(x, y) = \sum_{k=1}^{\infty} x_k \bar y_k.$$

The sum

$$(x, x) = \sum_{k=1}^{\infty} |x_k|^2 \tag{236}$$

defines the square of the length, or in other words, the norm of the
vector $x$. We introduce the following notation for this:

$$\sum_{k=1}^{\infty} |x_k|^2 = \|x\|^2. \tag{237}$$


The norm of any vector is positive, except in the case of the null
vector, the norm of which is zero. Two vectors $u$ and $v$ are said to
be orthogonal if their scalar product is zero, i.e. $(u, v) = 0$ or $(v, u) = 0$,
one equation being a consequence of the other. Scalar products are
subject to the same fundamental laws as in the case of a finite number
of dimensions [13 and 30]. In particular we have the inequality

$$|(x, y)| \le \|x\| \cdot \|y\|, \tag{238}$$

and the triangle rule follows exactly as in [30]:

$$\|x + y\| \le \|x\| + \|y\|. \tag{239}$$

If the vectors $x^{(k)}$ $(k = 1, 2, \dots, m)$ are orthogonal in pairs, i.e.
$(x^{(i)}, x^{(j)}) = 0$ for $i \ne j$, we obviously have

$$(x^{(1)} + \dots + x^{(m)},\ x^{(1)} + \dots + x^{(m)}) = (x^{(1)}, x^{(1)}) + \dots + (x^{(m)}, x^{(m)}),$$

or, what amounts to the same thing:

$$\|x^{(1)} + \dots + x^{(m)}\|^2 = \|x^{(1)}\|^2 + \dots + \|x^{(m)}\|^2, \tag{240}$$

i.e. the square of the norm of a sum of vectors that are orthogonal in
pairs is equal to the sum of the squares of the norms of the terms. This
proposition may be termed Pythagoras' theorem. It follows at once
from the definition of norm that, if $c$ is a complex number, we have
for the norm of $cx$:

$$\|cx\| = |c| \cdot \|x\|.$$
If the vectors $z^{(1)}, z^{(2)}, \dots, z^{(m)}$ are orthogonal in pairs and the
norm of each is equal to unity, i.e.

$$(z^{(p)}, z^{(q)}) = 0 \ \text{for}\ p \ne q, \qquad (z^{(p)}, z^{(p)}) = 1,$$

(240) gives

$$\|c_1 z^{(1)} + \dots + c_m z^{(m)}\|^2 = |c_1|^2 + \dots + |c_m|^2,$$

where the $c_k$ are arbitrary complex numbers.
The fundamental vectors in our space $H$ have the components

$$a^{(1)}(1, 0, 0, \dots); \quad a^{(2)}(0, 1, 0, \dots); \quad \dots$$

The $a^{(k)}$ are mutually orthogonal unit vectors. We can write the
components $x_k$ of the vector $x$ as scalar products:

$$x_k = (x, a^{(k)}).$$

We again consider an arbitrary system of m mutually orthogonal
vectors of unit length

$$z^{(k)} \quad (k = 1, 2, \dots, m).$$

The scalar product $(x, z^{(k)})$ defines the magnitude of the projection
of $x$ on the axis $z^{(k)}$. The $z^{(k)}$ do not form a complete system of axes
for the space $H$, and the sum

$$\sum_{k=1}^{m} (x, z^{(k)})\, z^{(k)}$$

in general differs from $x$. We can write our vector $x$ as:

$$x = \sum_{k=1}^{m} (x, z^{(k)})\, z^{(k)} + u. \tag{241}$$

On forming the scalar product on the right of both sides with $z^{(l)}$
and recalling that the $z^{(k)}$ are mutually orthogonal unit vectors,
we get

$$(x, z^{(l)}) = (x, z^{(l)}) + (u, z^{(l)}),$$

i.e. $(u, z^{(l)}) = 0$, or in other words, $u$ is orthogonal to all the $z^{(l)}$.
We can thus apply Pythagoras' theorem to the sum (241):

$$\|x\|^2 = \sum_{k=1}^{m} |(x, z^{(k)})|^2 + \|u\|^2,$$

whence the so-called Bessel inequality follows at once:

$$\|x\|^2 \ge \sum_{k=1}^{m} |(x, z^{(k)})|^2. \tag{242}$$

This can be stated as follows: the sum of the squares of the moduli
of the projections of a vector on to any given mutually orthogonal unit
vectors is not greater than the square of the length (norm) of the projected
vector itself. We have the sign of equality in (242) when and only
when the vector $u$ in (241) is zero, i.e. its components are all zero.
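
The Bessel inequality and the resolution (241) can be illustrated numerically. The sketch below uses a vector with finitely many non-zero components as a stand-in for a vector of $H$ (an assumption made for the sake of a finite computation) and the scalar product $(x, y) = \sum_k x_k \bar y_k$; the particular vectors are chosen for illustration only.

```python
import math

def inner(x, y):
    """Scalar product (x, y) = sum of x_k * conj(y_k)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

n = 50
x = [1 / k for k in range(1, n + 1)]        # components 1, 1/2, 1/3, ...
s = 1 / math.sqrt(2)
z1 = [1.0] + [0.0] * (n - 1)                # two mutually orthogonal unit vectors
z2 = [0.0, s, s] + [0.0] * (n - 3)
system = [z1, z2]

norm_sq = inner(x, x).real
projection_sq = sum(abs(inner(x, z)) ** 2 for z in system)

# Remainder u = x - sum of (x, z) z, which should be orthogonal to every z
u = list(x)
for z in system:
    c = inner(x, z)
    u = [ui - c * zi for ui, zi in zip(u, z)]

print(projection_sq, norm_sq)
```

The difference between the two printed numbers is the squared norm of the remainder `u`, as Pythagoras' theorem applied to (241) requires.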


46. The convergence of vectors. We now explain the idea of the
limit of a variable vector. Suppose we have a sequence of vectors $v^{(k)}$,
where k takes the values 1, 2, 3, ...

We denote the components of $v^{(k)}$ by $v_1^{(k)}, v_2^{(k)}, \dots$ We shall say
that vectors $v^{(k)}$ tend to the vector $v$ in the limit if

$$\|v - v^{(k)}\| \to 0, \quad\text{i.e.}\quad \|v - v^{(k)}\|^2 \to 0. \tag{243}$$

On writing $v_1, v_2, \dots$ for the components of $v$, we can express
condition (243) in the unabbreviated form

$$\lim_{k \to \infty} \left[\,|v_1 - v_1^{(k)}|^2 + |v_2 - v_2^{(k)}|^2 + \dots\right] = 0. \tag{244}$$

A sum of positive terms tending to zero implies that each term
tends to zero, i.e. we have directly from (244):

$$|v_m - v_m^{(k)}| \to 0 \ \text{as}\ k \to \infty \quad (m = 1, 2, \dots), \tag{245}$$

so that each component $v_m^{(k)}$ must tend to the corresponding component
$v_m$, or more precisely, the real and imaginary parts of $v_m^{(k)}$ must tend
to the real and imaginary parts of $v_m$. We notice that the converse
is not valid, i.e. condition (244) does not necessarily follow from (245).
Suppose for instance that $v^{(k)}$ has the components (0, ..., 0, 1, 0, ...),
the unity being in the kth place. Each component becomes zero on
indefinite increase of k, i.e. we have $v_m^{(k)} \to 0$ for any integral m,
i.e. $v_m = 0$ $(m = 1, 2, \dots)$, whereas the sum (244) remains throughout
equal to unity.
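
The counter-example can be made concrete with truncated vectors. In the sketch below each $v^{(k)}$ is cut off after a fixed number of components, which is an assumption made only so that the vectors can be stored; the names `unit_vector`, `component_values` and `norms_sq` are illustrative.

```python
def unit_vector(k, length):
    """Vector with unity in the k-th place (counting from 1) and zeros elsewhere."""
    return [1.0 if m == k else 0.0 for m in range(1, length + 1)]

length = 20
sequence = [unit_vector(k, length) for k in range(1, length + 1)]

m = 3
# Values of the m-th component along the sequence k = 1, 2, ...
component_values = [v[m - 1] for v in sequence]
# Sum (244) with the null vector as candidate limit: the squared norm of v^(k)
norms_sq = [sum(abs(c) ** 2 for c in v) for v in sequence]

print(component_values)
print(norms_sq)
```

Every component is zero from some index onwards, yet the squared distance from the null vector stays equal to unity, so the sequence does not converge to it in the sense of (243).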

If the $v^{(k)}$ sequence tends to $v$, we write $v^{(k)} \Rightarrow v$. We consider
the following example of convergence. Let $v(v_1, v_2, \dots)$ be a given
vector and let vectors $v^{(k)}$ be defined so that their first k components
are the same as those of $v$ whilst the remaining components are
zero, i.e.:

$$v^{(k)}(v_1, v_2, \dots, v_k, 0, 0, \dots).$$

It is easily shown that $v^{(k)} \Rightarrow v$. For in the present case

$$\|v - v^{(k)}\|^2 = \sum_{m=k+1}^{\infty} |v_m|^2,$$

and in view of the convergence of the series with the general term
$|v_m|^2$, the sum above tends to zero on indefinite increase of k. We
observe some simple rules relating to limits. If $u^{(k)} \Rightarrow u$ and $v^{(k)} \Rightarrow v$, we
have

$$u^{(k)} + v^{(k)} \Rightarrow u + v \quad\text{and}\quad (u^{(k)}, v^{(k)}) \to (u, v).$$

It may be mentioned that the scalar product is a complex number,
which is why we write $\to$ instead of $\Rightarrow$ in the last expression.
This last expression in fact shows the continuity of a scalar product.
We have by (239):

$$\|(u + v) - (u^{(k)} + v^{(k)})\| = \|(u - u^{(k)}) + (v - v^{(k)})\| \le \|u - u^{(k)}\| + \|v - v^{(k)}\|,$$

whilst by the definition of limit, $\|u - u^{(k)}\| \to 0$ and $\|v - v^{(k)}\| \to 0$.
It follows from the inequality that

$$\|(u + v) - (u^{(k)} + v^{(k)})\| \to 0,$$

i.e. in fact $u^{(k)} + v^{(k)} \Rightarrow u + v$. Furthermore, it follows from the
definition of limit that

$$u^{(k)} = u + s^{(k)}; \quad v^{(k)} = v + t^{(k)},$$

where $\|s^{(k)}\| \to 0$ and $\|t^{(k)}\| \to 0$. We have for the scalar product:

$$(u^{(k)}, v^{(k)}) = (u + s^{(k)},\ v + t^{(k)}) = (u, v) + (u, t^{(k)}) + (s^{(k)}, v) + (s^{(k)}, t^{(k)}),$$


whence

$$|(u^{(k)}, v^{(k)}) - (u, v)| \le |(u, t^{(k)})| + |(s^{(k)}, v)| + |(s^{(k)}, t^{(k)})|,$$

or, by (238):

$$|(u^{(k)}, v^{(k)}) - (u, v)| \le \|u\| \cdot \|t^{(k)}\| + \|s^{(k)}\| \cdot \|v\| + \|s^{(k)}\| \cdot \|t^{(k)}\|.$$

The right-hand side tends to zero, so that also

$$|(u^{(k)}, v^{(k)}) - (u, v)| \to 0, \quad\text{i.e.}\quad (u^{(k)}, v^{(k)}) \to (u, v).$$

In particular, $(u^{(k)}, u^{(k)}) \to (u, u)$, i.e. $\|u^{(k)}\|^2 \to \|u\|^2$ or
$\|u^{(k)}\| \to \|u\|$.

It is easily shown also that if the numerical sequence $c_k$ has the
limit $c$, we have $c_k u^{(k)} \Rightarrow cu$.

The necessary and sufficient condition for the existence of a limit 
is expressed as usual by Cauchy’s test. We shall state the test for 
a given case. Suppose we have the vector sequence 


v® .(£=1,2,...). (246) 


The necessary and sufficient condition for this sequence to have 
a limit is as follows: given any small positive ¢«, there exists an V 
such that 

|| vO — vo || <e, (247) 


provided only that 7 and m are > N. 
We first show that the condition is necessary. Let sequence (246) 
have the limit v. We can now write 


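$$\mathbf{v}^{(n)} - \mathbf{v}^{(m)} = (\mathbf{v}^{(n)} - \mathbf{v}) + (\mathbf{v} - \mathbf{v}^{(m)}),$$

and therefore, by the triangle rule:

$$\| \mathbf{v}^{(n)} - \mathbf{v}^{(m)} \| \le \| \mathbf{v}^{(n)} - \mathbf{v} \| + \| \mathbf{v} - \mathbf{v}^{(m)} \|.$$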


It follows at once from the definition of limit that both terms on 
the right tend to zero on increase of n and m, so that the same must 
be true for the term on the left, i.e. condition (247) must in fact 
be satisfied. We now turn to the sufficiency of (247). We assume that 
(247) is fulfilled, and show that the sequence (246) tends to a limit.
We can write (247) in the expanded form:

$$\sum_{s=1}^{\infty} | v_s^{(n)} - v_s^{(m)} |^2 < \varepsilon^2 \quad \text{for } n \text{ and } m > N, \tag{248}$$


the components of $\mathbf{v}^{(n)}$ being written $v_s^{(n)}$. It follows at once that for
any $s$:

$$| v_s^{(n)} - v_s^{(m)} | < \varepsilon \quad \text{for } n \text{ and } m > N$$

or alternatively, on separating into the real and imaginary parts:

$$v_s^{(n)} = \alpha_s^{(n)} + i\beta_s^{(n)},$$

we can write

$$| \alpha_s^{(n)} - \alpha_s^{(m)} | < \varepsilon \quad \text{and} \quad | \beta_s^{(n)} - \beta_s^{(m)} | < \varepsilon.$$

We can say by applying the ordinary Cauchy test that $\alpha_s^{(n)}$ and
$\beta_s^{(n)}$ have the limits $\alpha_s$ and $\beta_s$, and consequently $v_s^{(n)}$ tends to the
complex number $\alpha_s + i\beta_s$. We call this limit $v_s$, and show first that
the series $\sum_{s=1}^{\infty} | v_s |^2$ is convergent, i.e. the $v_s$ are the components of a
vector. On retaining a finite number of first terms in sum (248)
and passing to the limit as $n \to \infty$ in this finite sum, we get

$$\sum_{s=1}^{M} | v_s - v_s^{(m)} |^2 \le \varepsilon^2,$$

where $M$ is any integer. Passage to the limit with $M \to \infty$ in this
last expression gives us

$$\sum_{s=1}^{\infty} | v_s - v_s^{(m)} |^2 \le \varepsilon^2, \tag{249}$$

whence it follows at once that the numbers $v_s - v_s^{(m)}$ are the
components of a vector. We already know that this is true as regards
the numbers $v_s^{(m)}$, and we can therefore say that it is true for the $v_s$,
i.e. the $v_s$ are the components of a vector $\mathbf{v}$. We can thus write (249) as

$$\| \mathbf{v} - \mathbf{v}^{(m)} \| \le \varepsilon$$

for $m > N$, i.e. $\mathbf{v}^{(m)} \Rightarrow \mathbf{v}$, and sequence (246) in fact has a limit.
Each component $v_s$ of the vector $\mathbf{v}$ is obviously defined as the limit
of $v_s^{(m)}$, whence it follows at once that there can only be the one
limit. We now consider the infinite vector sum

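$$\mathbf{u}^{(1)} + \mathbf{u}^{(2)} + \dots \tag{250}$$

It is said to be convergent if the sum of the first $n$ terms:

$$\mathbf{s}^{(n)} = \mathbf{u}^{(1)} + \mathbf{u}^{(2)} + \dots + \mathbf{u}^{(n)}$$

has a limit in the above sense as $n \to \infty$. By Cauchy's test, the
necessary and sufficient condition for convergence is that

$$\| \mathbf{s}^{(n+p)} - \mathbf{s}^{(n)} \| = \| \mathbf{u}^{(n+1)} + \dots + \mathbf{u}^{(n+p)} \| < \varepsilon, \tag{251}$$

for $n > N$ and any $p$.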


We have, on taking into account the continuity of scalar products: 
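$$(\mathbf{x}, \mathbf{u}^{(1)} + \mathbf{u}^{(2)} + \dots) = (\mathbf{x}, \mathbf{u}^{(1)}) + (\mathbf{x}, \mathbf{u}^{(2)}) + \dots$$
$$(\mathbf{u}^{(1)} + \mathbf{u}^{(2)} + \dots, \mathbf{x}) = (\mathbf{u}^{(1)}, \mathbf{x}) + (\mathbf{u}^{(2)}, \mathbf{x}) + \dots$$

On applying this to the case when the vectors $\mathbf{u}^{(k)}$ are mutually
orthogonal, we have

$$(\mathbf{u}^{(1)} + \mathbf{u}^{(2)} + \dots, \mathbf{u}^{(1)} + \mathbf{u}^{(2)} + \dots) = (\mathbf{u}^{(1)}, \mathbf{u}^{(1)}) + (\mathbf{u}^{(2)}, \mathbf{u}^{(2)}) + \dots$$

or

$$\| \mathbf{u}^{(1)} + \mathbf{u}^{(2)} + \dots \|^2 = \| \mathbf{u}^{(1)} \|^2 + \| \mathbf{u}^{(2)} \|^2 + \dots$$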


i.e. Pythagoras’ theorem is also valid for the sum of an infinite set 
of mutually orthogonal vectors. 

We now establish the necessary and sufficient condition for the
convergence of series (250) of mutually orthogonal vectors. In accordance
with Cauchy's test, we have to form expression (251), the square of which
is equal, by Pythagoras' theorem, to

$$\| \mathbf{u}^{(n+1)} \|^2 + \dots + \| \mathbf{u}^{(n+p)} \|^2.$$

Hence it follows at once that the necessary and sufficient condition
for convergence of the series is the convergence of the series
consisting of the squares of the norms of the vectors $\mathbf{u}^{(k)}$. This result
can be expressed alternatively as follows: let $\mathbf{x}^{(k)}$ be mutually orthogonal
unit vectors. We form the series

$$\sum_{k=1}^{\infty} c_k \mathbf{x}^{(k)}, \tag{252}$$

where the $c_k$ are certain numbers. The necessary and sufficient
condition for the convergence of this series is, by what we have proved
above, the convergence of the series

$$\sum_{k=1}^{\infty} | c_k |^2.$$


This implies among other things that changing the order of the 
terms in series (252) does not affect its convergence. It is easy to 
show also that the sum of (252) remains unchanged on changing the 
order of the terms. 
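
The criterion can be illustrated on the Cauchy differences themselves. The sketch below assumes orthonormal vectors $\mathbf{x}^{(k)}$, so that by Pythagoras' theorem $\| \mathbf{s}^{(n+p)} - \mathbf{s}^{(n)} \|^2 = |c_{n+1}|^2 + \dots + |c_{n+p}|^2$; the two coefficient sequences are hypothetical examples chosen for illustration.

```python
import math

def tail_norm_sq(c, n, p):
    """||s^(n+p) - s^(n)||^2 for the series sum_k c(k) x^(k) with orthonormal x^(k).

    By Pythagoras' theorem this equals |c(n+1)|^2 + ... + |c(n+p)|^2.
    """
    return sum(abs(c(k)) ** 2 for k in range(n + 1, n + p + 1))

def square_summable(k):
    return 1 / k                 # sum of |c_k|^2 converges

def not_square_summable(k):
    return 1 / math.sqrt(k)      # sum of |c_k|^2 diverges

for n in (10, 100, 1000):
    print(n,
          tail_norm_sq(square_summable, n, 10 * n),
          tail_norm_sq(not_square_summable, n, n))
```

For $c_k = 1/k$ the differences become arbitrarily small, so series (252) converges although the sum of the $|c_k|$ diverges; for $c_k = 1/\sqrt{k}$ the difference with $p = n$ stays above $1/2$, so Cauchy's test fails.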


47. Complete systems of mutually orthogonal vectors. We now bring
in an important concept, that of a complete system of mutually
orthogonal vectors. We can show, as in the case of a finite number of
dimensions, that every finite set of mutually orthogonal vectors is linearly
independent. We saw in the case of n-dimensional space that a set
of any $n$ linearly independent vectors formed a complete system in
the sense that any vector could be expressed as a linear combination
of these $n$ vectors. We no longer have such a simple criterion of
completeness in the case of an H space, since the number of dimensions
is infinite. We shall only employ mutually orthogonal unit vectors
in future.

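Suppose we have an infinite set of mutually orthogonal unit vectors
$\mathbf{x}^{(k)}$ $(k = 1, 2, \dots)$, and let $\mathbf{y}$ be a given vector of the space H. As
in the case of a finite number of vectors, we form the sum of the
projections of our vector on the axes:

$$\sum_{k=1}^{\infty} (\mathbf{y}, \mathbf{x}^{(k)}) \mathbf{x}^{(k)}. \tag{253}$$

As shown above, we have the inequality for any $m$:

$$\sum_{k=1}^{m} | (\mathbf{y}, \mathbf{x}^{(k)}) |^2 \le \| \mathbf{y} \|^2$$

and therefore in the limit

$$\sum_{k=1}^{\infty} | (\mathbf{y}, \mathbf{x}^{(k)}) |^2 \le \| \mathbf{y} \|^2, \tag{254}$$

so that the series on the left must be convergent. It now follows at
once that series (253) must also be convergent. Suppose

$$\mathbf{y} = \sum_{k=1}^{\infty} (\mathbf{y}, \mathbf{x}^{(k)}) \mathbf{x}^{(k)} + \mathbf{u}. \tag{255}$$

It may readily be shown as in [45] that the vector $\mathbf{u}$ is orthogonal
to all the vectors $\mathbf{x}^{(k)}$, and consequently, by Pythagoras' theorem:

$$\| \mathbf{y} \|^2 = \sum_{k=1}^{\infty} | (\mathbf{y}, \mathbf{x}^{(k)}) |^2 + \| \mathbf{u} \|^2. \tag{256}$$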
k=1 


Hence it follows that if the vector u in (255) differs from zero, we 
have the < sign in (254), whilst if u is zero (i.e. all its components 
vanish), we have the = sign in (254). 

The system of axes $\mathbf{x}^{(k)}$ is said to be complete if we have the = sign
in (254) for any vector $\mathbf{y}$ of the H space. In this case, we can evidently
resolve any vector in terms of the complete system of fundamental
vectors:

$$\mathbf{y} = \sum_{k=1}^{\infty} (\mathbf{y}, \mathbf{x}^{(k)}) \mathbf{x}^{(k)}. \tag{257}$$
k= 


A complete system is alternatively said to be closed, whilst the 
formula 


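$$\sum_{k=1}^{\infty} | (\mathbf{y}, \mathbf{x}^{(k)}) |^2 = \| \mathbf{y} \|^2 \tag{258}$$

is called the closure equation. We notice a consequence of (258),
called the generalized closure equation. Let us have two vectors $\mathbf{y}$
and $\mathbf{z}$, and let the $\mathbf{x}^{(k)}$ form a complete system, so that for any $\mathbf{y}$ and $\mathbf{z}$:

$$\mathbf{y} = \sum_{k=1}^{\infty} (\mathbf{y}, \mathbf{x}^{(k)}) \mathbf{x}^{(k)}; \quad \mathbf{z} = \sum_{k=1}^{\infty} (\mathbf{z}, \mathbf{x}^{(k)}) \mathbf{x}^{(k)}. \tag{259}$$

On applying (258) to the vectors $\mathbf{y} + \mathbf{z}$ and $\mathbf{y} + i\mathbf{z}$, we get:

$$\sum_{k=1}^{\infty} [(\mathbf{y}, \mathbf{x}^{(k)}) + (\mathbf{z}, \mathbf{x}^{(k)})] [\overline{(\mathbf{y}, \mathbf{x}^{(k)})} + \overline{(\mathbf{z}, \mathbf{x}^{(k)})}] = (\mathbf{y} + \mathbf{z}, \mathbf{y} + \mathbf{z})$$

$$\sum_{k=1}^{\infty} [(\mathbf{y}, \mathbf{x}^{(k)}) + i(\mathbf{z}, \mathbf{x}^{(k)})] [\overline{(\mathbf{y}, \mathbf{x}^{(k)})} - i\overline{(\mathbf{z}, \mathbf{x}^{(k)})}] = (\mathbf{y} + i\mathbf{z}, \mathbf{y} + i\mathbf{z}).$$

We obtain, on using the closure equation for $\mathbf{y}$ and $\mathbf{z}$:

$$\sum_{k=1}^{\infty} (\mathbf{y}, \mathbf{x}^{(k)}) \overline{(\mathbf{z}, \mathbf{x}^{(k)})} + \sum_{k=1}^{\infty} (\mathbf{z}, \mathbf{x}^{(k)}) \overline{(\mathbf{y}, \mathbf{x}^{(k)})} = (\mathbf{y}, \mathbf{z}) + (\mathbf{z}, \mathbf{y}),$$

$$\sum_{k=1}^{\infty} (\mathbf{y}, \mathbf{x}^{(k)}) \overline{(\mathbf{z}, \mathbf{x}^{(k)})} - \sum_{k=1}^{\infty} (\mathbf{z}, \mathbf{x}^{(k)}) \overline{(\mathbf{y}, \mathbf{x}^{(k)})} = (\mathbf{y}, \mathbf{z}) - (\mathbf{z}, \mathbf{y}),$$

whence follows the generalized closure equation:

$$\sum_{k=1}^{\infty} (\mathbf{y}, \mathbf{x}^{(k)}) \overline{(\mathbf{z}, \mathbf{x}^{(k)})} = (\mathbf{y}, \mathbf{z}). \tag{260}$$

If $\mathbf{z}$ is the same as $\mathbf{y}$, this equation becomes (258).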
We now consider in detail the fundamental vectors $\mathbf{x}^{(k)}$. Since these
are mutually orthogonal unit vectors, we have for their components
$x_s^{(k)}$ $(s = 1, 2, \dots)$:

$$\sum_{s=1}^{\infty} x_s^{(p)} \overline{x_s^{(q)}} = \delta_{pq}, \tag{261}$$

where $\delta_{pq} = 0$ for $p \ne q$ and $\delta_{pp} = 1$.


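We now find the condition for completeness of the system $\mathbf{x}^{(k)}$.
We bring in for this purpose the vectors $\mathbf{y}^{(l)}$, whose $l$th components
are equal to unity and the rest zero. We have

$$(\mathbf{y}^{(l)}, \mathbf{x}^{(k)}) = \overline{x_l^{(k)}},$$

and (258) gives

$$\sum_{k=1}^{\infty} | x_l^{(k)} |^2 = 1 \quad (l = 1, 2, 3, \dots).$$

On now applying (260) to the vectors $\mathbf{y}^{(p)}$ and $\mathbf{y}^{(q)}$ for $p \ne q$, and
using the fact that they are orthogonal, we obtain in addition to the
above the following conditions:

$$\sum_{k=1}^{\infty} \overline{x_p^{(k)}} x_q^{(k)} = 0 \quad (p \ne q),$$

i.e. in general,

$$\sum_{k=1}^{\infty} \overline{x_p^{(k)}} x_q^{(k)} = \delta_{pq}. \tag{262}$$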


We write down the components of our vectors $\mathbf{x}^{(k)}$ in the form of an
infinite matrix:

$$\begin{matrix} x_1^{(1)}, & x_1^{(2)}, & x_1^{(3)}, & \dots \\ x_2^{(1)}, & x_2^{(2)}, & x_2^{(3)}, & \dots \\ \dots & \dots & \dots & \dots \end{matrix} \tag{263}$$

Equations (261), expressing the fact that the $\mathbf{x}^{(k)}$ are mutually
orthogonal unit vectors, are equivalent to the fact that the columns
of this matrix are normalized and orthogonal. Conditions (262) show
that the rows must also be normalized and orthogonal for the system
of $\mathbf{x}^{(k)}$ to be complete.

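We now show that conditions (262) with $p = q$ are likewise
sufficient for completeness. In fact, if these conditions are satisfied with
$p = q$, the closure equation applies for the vectors

$$\mathbf{y}^{(l)} \; (0, \dots, 0, 1, 0, \dots)$$

and all these can be expressed linearly in terms of the vectors $\mathbf{x}^{(k)}$:

$$\mathbf{y}^{(l)} = \sum_{k=1}^{\infty} \overline{x_l^{(k)}} \mathbf{x}^{(k)}.$$

We show that the same is true for any vector $\mathbf{z}$. We denote by $\mathbf{z}^{(l)}$
the vector whose first $l$ components are the same as for $\mathbf{z}$, whilst the
remaining components are zero. We obviously have

$$\mathbf{z}^{(l)} = z_1 \mathbf{y}^{(1)} + \dots + z_l \mathbf{y}^{(l)},$$

and since the $\mathbf{y}^{(k)}$ are expressible linearly in terms of the $\mathbf{x}^{(k)}$ the same
can be said of $\mathbf{z}^{(l)}$:

$$\mathbf{z}^{(l)} = \sum_{k=1}^{\infty} d_k^{(l)} \mathbf{x}^{(k)}.$$

On forming the scalar product of both sides with $\mathbf{x}^{(k)}$, we get the
following usual expressions for the coefficients $d_k^{(l)}$:

$$d_k^{(l)} = (\mathbf{z}^{(l)}, \mathbf{x}^{(k)}).$$

On the other hand, as we have seen above:

$$\mathbf{z} = \sum_{k=1}^{\infty} d_k \mathbf{x}^{(k)} + \mathbf{u}, \tag{264}$$

where $\mathbf{u}$ is orthogonal to all the $\mathbf{x}^{(k)}$. We now consider the difference

$$\mathbf{z} - \mathbf{z}^{(l)} = \sum_{k=1}^{\infty} (d_k - d_k^{(l)}) \mathbf{x}^{(k)} + \mathbf{u}.$$

By Pythagoras' theorem:

$$\| \mathbf{z} - \mathbf{z}^{(l)} \|^2 = \sum_{k=1}^{\infty} | d_k - d_k^{(l)} |^2 + \| \mathbf{u} \|^2$$

and consequently

$$\| \mathbf{u} \|^2 \le \| \mathbf{z} - \mathbf{z}^{(l)} \|^2.$$

The vector $\mathbf{u}$ does not depend on $l$, whilst we know from [46] that
the right-hand side tends to zero as $l \to \infty$. Hence it follows at once
that $\mathbf{u} = 0$, and (264) gives the resolution of any vector $\mathbf{z}$ in terms
of the $\mathbf{x}^{(k)}$:

$$\mathbf{z} = \sum_{k=1}^{\infty} d_k \mathbf{x}^{(k)} \quad [d_k = (\mathbf{z}, \mathbf{x}^{(k)})]. \tag{265}$$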
k=1 


Thus the closure equation is valid for any vector. The final
result can be stated as follows. The necessary and sufficient condition
for mutually orthogonal unit vectors $\mathbf{x}^{(k)}$ to form a complete (closed)
system is that the sum of squares of the moduli of the elements of each
row of matrix (263) is equal to unity. If this condition is satisfied for
matrix (263), its rows are orthogonal.
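
A finite-dimensional analogue makes the row criterion concrete. The following sketch works in three complex dimensions rather than in the space H, which is an assumption made purely for illustration: two orthonormal vectors leave one row of the matrix with sum of squares less than unity, and adding a third restores it. The sample vectors are hypothetical.

```python
import math

def inner(u, v):
    """Scalar product (u, v) = sum of u_s * conj(v_s)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def row_square_sums(vectors):
    """vectors[k][s] holds x_s^(k); the vectors are the columns of matrix (263).

    Returns, for each row s, the sum over k of |x_s^(k)|^2.
    """
    dim = len(vectors[0])
    return [sum(abs(x[s]) ** 2 for x in vectors) for s in range(dim)]

def bessel_sum(y, vectors):
    """Left-hand side of (254): sum over k of |(y, x^(k))|^2."""
    return sum(abs(inner(y, x)) ** 2 for x in vectors)

r = 1 / math.sqrt(2)
x1 = [r, r * 1j, 0]
x2 = [r, -r * 1j, 0]
x3 = [0, 0, 1]
y = [1, 2j, 3]

print(row_square_sums([x1, x2]))        # third row sums to 0: system not complete
print(row_square_sums([x1, x2, x3]))    # every row sums to 1: system complete
print(bessel_sum(y, [x1, x2]), bessel_sum(y, [x1, x2, x3]), inner(y, y).real)
```

With only `x1` and `x2` the Bessel sum for `y` falls short of the squared norm of `y`; with all three vectors the closure equation (258) holds.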


48. Linear transformations with an infinite set of variables. We shall 
consider in brief outline the linear transformation with an infinite set 
of variables: 

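$$x'_1 = a_{11} x_1 + a_{12} x_2 + \dots$$
$$x'_2 = a_{21} x_1 + a_{22} x_2 + \dots \tag{266}$$
$$\dots \dots \dots \dots \dots$$

or

$$\mathbf{x}' = A\mathbf{x}, \tag{267}$$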


where $A$ is the infinite matrix with elements $a_{ik}$. We first of all lay
down the condition that the infinite series on the right-hand sides of
equations (266) are convergent for any vector $\mathbf{x}$ of the space H. As we
know, this condition is satisfied if the series

$$\sum_{k=1}^{\infty} | a_{ik} |^2 \quad (i = 1, 2, \dots)$$

are convergent for any $i$. It can be shown that this condition is
necessary as well as sufficient. If this condition is not satisfied, the series
on the right-hand sides of equations (266) are convergent for only a
part, and not the whole, of the space H.

It is natural to lay down the further condition that if $x_i$ is a vector
component, the number $x'_i$ obtained as a result of transformation (266)
also represents a vector component of the space H, i.e. the series

$$\sum_{k=1}^{\infty} | x'_k |^2$$

must be convergent if we have convergence of the series

$$\sum_{k=1}^{\infty} | x_k |^2.$$

If the matrix $A$ satisfies the above two conditions, the corresponding
transformation $A$ is said to be bounded. The point of this term lies in
the fact that we can prove the existence for such a transformation
of a positive number $M$ such that

$$\| \mathbf{x}' \|^2 \le M \| \mathbf{x} \|^2 \tag{268}$$

or in the expanded form:

$$\sum_{k=1}^{\infty} | x'_k |^2 \le M \sum_{k=1}^{\infty} | x_k |^2. \tag{269}$$


We shall dwell on a particular case of a linear transformation. We take 
the transformation 


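$$x'_1 = \overline{u}_{11} x_1 + \overline{u}_{21} x_2 + \dots$$
$$x'_2 = \overline{u}_{12} x_1 + \overline{u}_{22} x_2 + \dots \tag{270}$$
$$\dots \dots \dots \dots \dots$$

the series

$$\sum_{k=1}^{\infty} | u_{ki} |^2$$

being as usual assumed convergent for any $i$. We bring into the
discussion vectors $\mathbf{u}^{(k)}$ with components $u_{1k}, u_{2k}, \dots$, and suppose that the
coefficients $u_{ik}$ are such that the vectors $\mathbf{u}^{(k)}$ form a complete system
of mutually orthogonal unit vectors. As we have shown above, this is
equivalent to the rows and columns of the matrix of $u_{ik}$ being
orthogonal and normalized, i.e.

$$\sum_{s=1}^{\infty} \overline{u}_{sp} u_{sq} = \delta_{pq}; \quad \sum_{s=1}^{\infty} \overline{u}_{ps} u_{qs} = \delta_{pq}. \tag{271}$$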
s= s= 


The corresponding transformation (270) is said to be unitary in this 
case. 
We can write equations (270) as 
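$$(\mathbf{x}, \mathbf{u}^{(1)}) = x'_1$$
$$(\mathbf{x}, \mathbf{u}^{(2)}) = x'_2 \tag{272}$$
$$\dots \dots \dots$$

The closure formula gives us

$$\sum_{k=1}^{\infty} | x'_k |^2 = \| \mathbf{x} \|^2 = \sum_{k=1}^{\infty} | x_k |^2,$$

i.e. as in the case of a finite number of dimensions, a unitary
transformation does not change the length of a vector, and we can take
$M = 1$ in expression (268).

Equations (270) may readily be solved with respect to the $x_i$, which
gives us the inverse transformation to (270). On using the fact that
the system $\mathbf{u}^{(k)}$ is complete, we obtain from equations (272) the
following definitive expression for the vector $\mathbf{x}$:

$$\mathbf{x} = x'_1 \mathbf{u}^{(1)} + x'_2 \mathbf{u}^{(2)} + \dots \tag{273}$$

or

$$x_1 = u_{11} x'_1 + u_{12} x'_2 + \dots$$
$$x_2 = u_{21} x'_1 + u_{22} x'_2 + \dots \tag{274}$$
$$\dots \dots \dots \dots \dots$$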

In other words, if equations (270) have a solution, it must be
expressed by (273) or (274). Of course, we are referring here only to the
solutions $x_k$ for which the sums of the squares of the moduli are
convergent. We now show that equation (273) in fact yields the solution
of the problem. The given numbers $x'_k$ are by hypothesis such that
the squares of the moduli form a convergent series. Hence follows the
convergence of series (273), as we know, since the $\mathbf{u}^{(k)}$ are mutually
orthogonal unit vectors. We have for the sum of this series:

$$(\mathbf{x}, \mathbf{u}^{(k)}) = (x'_1 \mathbf{u}^{(1)} + x'_2 \mathbf{u}^{(2)} + \dots, \mathbf{u}^{(k)}) = x'_k,$$


i.e. the sum in fact satisfies system (270). System (274) shows that the 
inverse of the unitary transformation is obtained by replacing rows by 
columns and all the elements by their conjugates, i.e. we have here 
an entirely analogous case to that of a finite number of dimensions. 
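
The rule for inverting a unitary transformation can be tried out on a small case. The following is a minimal sketch in two dimensions, an assumption made for illustration since the text deals with infinitely many variables; the forward map forms the scalar products $(\mathbf{x}, \mathbf{u}^{(k)})$ and the inverse map recombines the vectors $\mathbf{u}^{(k)}$. The sample vectors are hypothetical.

```python
import math

def forward(u_vectors, x):
    """x'_k = (x, u^(k)) = sum over s of x_s * conj(u_sk)."""
    return [sum(xs * complex(us).conjugate() for xs, us in zip(x, u))
            for u in u_vectors]

def inverse(u_vectors, x_prime):
    """x = x'_1 u^(1) + x'_2 u^(2) + ..., i.e. x_s = sum over k of u_sk * x'_k."""
    dim = len(u_vectors[0])
    return [sum(xp * u[s] for xp, u in zip(x_prime, u_vectors))
            for s in range(dim)]

def norm_sq(x):
    return sum(abs(c) ** 2 for c in x)

r = 1 / math.sqrt(2)
u_vectors = [[r, r * 1j], [r, -r * 1j]]   # u^(1), u^(2): orthonormal and complete
x = [2 + 1j, -3j]

x_prime = forward(u_vectors, x)
x_back = inverse(u_vectors, x_prime)
print(x_prime, norm_sq(x), norm_sq(x_prime))
print(x_back)
```

The forward coefficients are the conjugates of the components of the $\mathbf{u}^{(k)}$, while the inverse uses the components themselves with rows and columns interchanged; the length of the vector is unchanged.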

In the general case even of bounded matrices, the problems of in- 
verse matrices and of reduction to the diagonal form present greater 
difficulty and lead to results that have no strict analogue in finite- 
dimensional space. A more detailed account of linear transformations 
by means of infinite matrices will be found in the fifth volume. We 
confine ourselves here to indicating a few results. We may mention 
the necessary and sufficient condition as regards the coefficients $a_{ik}$
for transformation (266) to be bounded. It is stated thus: there exists
a positive number $N$ such that, with any positive integral $k$ and any
numbers $\xi_s$ $(s = 1, 2, \dots)$, we have the inequality

$$\left| \sum_{n, m=1}^{k} a_{nm} \xi_n \overline{\xi}_m \right| \le N \sum_{s=1}^{k} | \xi_s |^2.$$


The following simple sufficient condition may also be proved for
boundedness of the transformation (266): there exists a positive number
$l$ (not dependent on $m$ or $n$) such that we have the inequalities

$$\sum_{n=1}^{\infty} | a_{nm} | \le l \quad (m = 1, 2, \dots); \qquad \sum_{m=1}^{\infty} | a_{nm} | \le l \quad (n = 1, 2, \dots).$$

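If the matrix $A$ defines a bounded transformation, there exists a
unique matrix $A^{*}$ such that, for any $\mathbf{x}$ and $\mathbf{y}$:

$$(A\mathbf{x}, \mathbf{y}) = (\mathbf{x}, A^{*}\mathbf{y}),$$

the elements $a^{*}_{ik}$ of $A^{*}$ being given by $a^{*}_{ik} = \overline{a}_{ki}$. If $A^{*}$ is the same as $A$,
i.e. $a_{ik} = \overline{a}_{ki}$, the bounded transformation (266) is called Hermitian
or self-conjugate.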


We have for bounded transformations: 


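$$(A\mathbf{x}, \mathbf{y}) = \sum_{n=1}^{\infty} \Big( \sum_{m=1}^{\infty} a_{nm} x_m \Big) \overline{y}_n = \sum_{m=1}^{\infty} x_m \Big( \sum_{n=1}^{\infty} a_{nm} \overline{y}_n \Big) = \lim_{k, l \to \infty} \sum_{m=1}^{k} \sum_{n=1}^{l} a_{nm} x_m \overline{y}_n.$$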


We notice the important particular case of a bounded operator, when 
we have convergence of the double series 


$$\sum_{n, m=1}^{\infty} | a_{nm} |^2. \tag{275}$$

In this case the double series

$$\sum_{n, m=1}^{\infty} a_{nm} x_m \overline{y}_n$$

is absolutely convergent for any choice of vectors $\mathbf{x}(x_1, x_2, \dots)$ and
$\mathbf{y}(y_1, y_2, \dots)$. If, in addition to the convergence of series (275), we have
$a_{ik} = \overline{a}_{ki}$, we arrive at the possibility of reducing the Hermitian form
to a sum of squares with the aid of a unitary transformation:

$$\sum_{n, m=1}^{\infty} a_{nm} x_m \overline{x}_n = \sum_{k=1}^{\infty} \lambda_k z_k \overline{z}_k,$$

where the vector $\mathbf{z}(z_1, z_2, \dots)$ is obtained by application of a unitary
transformation to the vector $\mathbf{x}(x_1, x_2, \dots)$: $\mathbf{z} = U\mathbf{x}$. With this, $\lambda_k \to 0$ as
$k \to \infty$. If $A$ and $B$ are infinite matrices, yielding bounded
transformations, successive application of them also yields a bounded
transformation, the coefficients of which are given by the usual expressions

$$\{BA\}_{ik} = \sum_{s=1}^{\infty} \{B\}_{is} \{A\}_{sk}.$$

We remark also that, if the vector sequence $\mathbf{x}^{(k)}$ has the limit $\mathbf{x}$, i.e.
$\mathbf{x}^{(k)} \Rightarrow \mathbf{x}$, then $A\mathbf{x}^{(k)} \Rightarrow A\mathbf{x}$, if $A$ is the matrix of a bounded
transformation.

Unbounded linear transformations also play an essential part in 
applications to mathematical physics. These are discussed in the fifth 
volume. 


49. Functional space. We have considered the space H in which
a vector is defined by an infinite set of components, enumerated by
means of integers: the first component being $x_1$, the second $x_2$, and
so on. We now turn to the functional space F in which the role of a
vector is played by a function of one or more arguments which are
capable of continuous variation.

We consider a function $f(x)$, defined in the interval $a \le x \le b$. We
can regard the function as a vector; for every value $x_0$ of the above
interval there is a corresponding number $f(x_0)$ which gives the
component of the vector with the subscript $x_0$. Here the independent
variable $x$, which plays the part of component subscript, runs
continuously through all the values in the interval $a \le x \le b$, so that our
vector $f(x)$ has a continuous set of components. The value $x_0$
corresponds to the number of an axis in previous notations, whilst the value
of the function $f(x_0)$ gives the magnitude of the corresponding
component. We shall assume here that $f(x)$ can take both real and complex
values, whilst the interval of variation of the independent variable
will always be taken to be a finite segment of the real axis.

For the present we shall consider for the sake of definiteness the
complex functions $f(x) = f_1(x) + i f_2(x)$, defined and continuous in
the finite interval $a \le x \le b$.

Such functions can be multiplied by complex numbers and added, 
as in the case of vectors of space H. This leads to further continuous 
functions. When defining the norm and scalar product we must replace 
summations everywhere by integrations. A scalar product is defined 


by

$$(\varphi(x), \psi(x)) = \int_a^b \varphi(x) \overline{\psi(x)} \, dx \tag{276}$$

and the square of the norm by

$$\| f(x) \|^2 = (f(x), f(x)) = \int_a^b | f(x) |^2 \, dx. \tag{277}$$

Let the system of functions $\varphi_k(x)$ $(k = 1, 2, \dots)$ form a system of
mutually orthogonal unit vectors, i.e.

$$\int_a^b \varphi_p(x) \overline{\varphi_q(x)} \, dx = \delta_{pq}. \tag{278}$$


We have already mentioned such systems of normalized and orthog- 
onal functions [II, 148] and we confine ourselves here to recalling 
some results that have a direct connection with the above. The only 
new feature compared with [II, 148] is the fact that our present 
functions can also take complex values. 


Suppose, then, that the $\varphi_k(x)$ form an orthogonal normalized system
and let $f(x)$ be a given vector (or function). We bring into the discussion
the Fourier coefficients of $f(x)$ or, in our present terminology, the
magnitudes of the projections of the vector $f(x)$ on the axes of the
functional space represented by the functions $\varphi_k(x)$:

$$a_k = (f(x), \varphi_k(x)) = \int_a^b f(x) \overline{\varphi_k(x)} \, dx. \tag{279}$$

We consider the integral

$$I_n = \int_a^b \Big| f(x) - \sum_{k=1}^{n} a_k \varphi_k(x) \Big|^2 dx \tag{280}$$

or

$$I_n = \int_a^b \Big[ f(x) - \sum_{k=1}^{n} a_k \varphi_k(x) \Big] \Big[ \overline{f(x)} - \sum_{k=1}^{n} \overline{a}_k \overline{\varphi_k(x)} \Big] dx.$$

We take into account equations (278) and (279) and arrive at the
following expression for the integral:

$$I_n = \int_a^b | f(x) |^2 \, dx - \sum_{k=1}^{n} | a_k |^2,$$

whilst in view of the fact that $I_n \ge 0$, we have

$$\sum_{k=1}^{n} | a_k |^2 \le \int_a^b | f(x) |^2 \, dx \tag{281}$$

and in the limit, as $n \to \infty$:

$$\sum_{k=1}^{\infty} | a_k |^2 \le \int_a^b | f(x) |^2 \, dx. \tag{282}$$

This is known as Bessel's inequality.
If we have the = sign in the last expression (282), the integral $I_n$
tends to zero on indefinite increase of $n$ and, conversely, if the integral
tends to zero, we have the = sign in (282).
If the = sign is obtained in (282), i.e.

$$\sum_{k=1}^{\infty} | a_k |^2 = \int_a^b | f(x) |^2 \, dx \tag{283}$$

for any continuous function $f(x)$, the system of functions $\varphi_k(x)$ is said to
be complete or closed, whilst equation (283) is called the closure equation.


The integral $I_n$ here tends to zero for any continuous function
$f(x)$:

$$\lim_{n \to \infty} \int_a^b \Big| f(x) - \sum_{k=1}^{n} a_k \varphi_k(x) \Big|^2 dx = 0, \tag{284}$$

i.e. any such function can be represented to any required degree of
accuracy by a linear combination of a finite number of $\varphi_k(x)$, the phrase
"to any required degree of accuracy" being understood to mean, not
arbitrary smallness of the difference itself:

$$\Big| f(x) - \sum_{k=1}^{n} a_k \varphi_k(x) \Big|,$$

but arbitrary smallness of the integral $I_n$ for large $n$. Thus if we want
to be more precise, we should speak of approximating to $f(x)$ by a
linear combination of a finite number of $\varphi_k(x)$ with as small a root
mean square error as desired.
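
Bessel's inequality and the behaviour of the mean square error can be observed numerically. The sketch below makes several assumptions that are not in the text: the interval is taken as $(0, \pi)$, the orthonormal system as $\varphi_k(x) = \sqrt{2/\pi} \sin kx$, the function as $f(x) = 1$, and the integrals are approximated by the midpoint rule.

```python
import math

A, B = 0.0, math.pi   # assumed interval (a, b)

def phi(k, x):
    """Assumed orthonormal system on (0, pi): sqrt(2/pi) * sin(kx)."""
    return math.sqrt(2 / math.pi) * math.sin(k * x)

def f(x):
    return 1.0        # sample real-valued function

def integrate(g, steps=2000):
    """Midpoint-rule approximation to the integral of g over (A, B)."""
    h = (B - A) / steps
    return h * sum(g(A + (j + 0.5) * h) for j in range(steps))

def fourier_coefficient(k):
    # a_k = (f, phi_k); no conjugate is needed because phi_k is real here.
    return integrate(lambda x: f(x) * phi(k, x))

def mean_square_error(n):
    """Return (I_n, sum of |a_k|^2 for k <= n)."""
    coeffs = [fourier_coefficient(k) for k in range(1, n + 1)]

    def residual_sq(x):
        partial = sum(a * phi(k, x) for k, a in enumerate(coeffs, start=1))
        return (f(x) - partial) ** 2

    return integrate(residual_sq), sum(a * a for a in coeffs)

norm_sq = integrate(lambda x: f(x) ** 2)
for n in (1, 5, 25):
    i_n, bessel = mean_square_error(n)
    print(n, i_n, bessel, norm_sq)
```

The sum of squares of the coefficients stays below the integral of $|f|^2$, and $I_n$ decreases as $n$ grows, even though every partial sum vanishes at the end-points where $f$ equals unity. This is the distinction drawn above between smallness of $I_n$ and smallness of the difference itself.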

Just as in [47], a generalized closure equation may be written for a
complete system of functions $\varphi_k(x)$. In fact, let $a_k$ and $b_k$ be the Fourier
coefficients of the functions $f(x)$ and $f_1(x)$:

$$a_k = \int_a^b f(x) \overline{\varphi_k(x)} \, dx; \quad b_k = \int_a^b f_1(x) \overline{\varphi_k(x)} \, dx. \tag{285}$$

The following generalized closure equation is valid:

$$\sum_{k=1}^{\infty} a_k \overline{b}_k = \int_a^b f(x) \overline{f_1(x)} \, dx. \tag{286}$$

Let the $a_k$ be the Fourier coefficients of $f(x)$ as above. We form the
Fourier series

$$\sum_{k=1}^{\infty} a_k \varphi_k(x).$$

We cannot say that this series is convergent, or still less that its
sum is equal to $f(x)$. The following notation is commonly used:

$$f(x) \sim \sum_{k=1}^{\infty} a_k \varphi_k(x), \tag{287}$$

where the symbol $\sim$ merely indicates that the infinite series on the
right is the Fourier series for $f(x)$. Though (287) is not an equation in
the ordinary sense of the word, yet as we saw in [II, 148], if the $\varphi_k(x)$
form a complete system, the expression becomes a strict equality on
term-by-term integration of the right-hand side, i.e.

$$\int_{x_1}^{x_2} f(x) \, dx = \sum_{k=1}^{\infty} a_k \int_{x_1}^{x_2} \varphi_k(x) \, dx \quad (a \le x_1 < x_2 \le b).$$

Prior to integration, we can multiply both sides of (287) by the
continuous function $\psi(x)$, i.e.

$$\int_{x_1}^{x_2} f(x) \psi(x) \, dx = \sum_{k=1}^{\infty} a_k \int_{x_1}^{x_2} \varphi_k(x) \psi(x) \, dx.$$

Integration over the full interval $(a, b)$ gives us

$$\int_a^b f(x) \psi(x) \, dx = \sum_{k=1}^{\infty} a_k \int_a^b \varphi_k(x) \psi(x) \, dx.$$

It may easily be verified that this expression represents the
generalized closure equation for the functions $f(x)$ and $\overline{\psi(x)}$.


50. The connection between functional and Hilbert space. We now 
undertake to establish the relationship, of great importance in theore- 
tical physics, between the functional space described in the previous 
section and the space H that we discussed earlier. 

Suppose we have the complete system of orthogonal, normalized 


functions 
$$\varphi_k(x) \quad (k = 1, 2, \dots) \tag{288}$$

in a functional space, so that equation (283) holds for any continuous
function $f(x)$. We take a second continuous function $f_1(x)$ and suppose
as above that

$$b_k = \int_a^b f_1(x) \overline{\varphi_k(x)} \, dx.$$

On applying the closure formula to the difference $f(x) - f_1(x)$, we get

$$\sum_{k=1}^{\infty} | a_k - b_k |^2 = \int_a^b | f(x) - f_1(x) |^2 \, dx. \tag{289}$$

If the continuous functions $f(x)$ and $f_1(x)$ differ, the right-hand side
is certainly greater than zero, and consequently the coefficients $b_k$
cannot all be the same as the $a_k$, i.e. different continuous functions
have different sets of Fourier coefficients with respect to system (288).
Hence every continuous function is fully characterized by its Fourier
coefficients, the squares of the moduli of which form on summation a
convergent series, i.e. to every continuous function there corresponds a
definite vector of the space H, the vectors corresponding to different
continuous functions being themselves different. Let $f_n(x)$ $(n = 1, 2, \dots)$
be a sequence of functions with Fourier coefficients $a_k^{(n)}$, i.e.

$$a_k^{(n)} = \int_a^b f_n(x) \overline{\varphi_k(x)} \, dx. \tag{290}$$

The closure equation gives

$$\sum_{k=1}^{\infty} | a_k - a_k^{(n)} |^2 = \int_a^b | f(x) - f_n(x) |^2 \, dx, \tag{291}$$

whence it follows at once that the convergence of a vector with
components $a_k^{(n)}$ $(k = 1, 2, \dots)$ of the space H to a vector with components
$a_k$ is equivalent to the equation in our functional space:

$$\lim_{n \to \infty} \int_a^b | f(x) - f_n(x) |^2 \, dx = 0. \tag{292}$$

If we take vectors with components

$$\mathbf{z}(a_1, a_2, \dots) \quad \text{and} \quad \mathbf{z}^{(n)}(a_1, \dots, a_n, 0, 0, \dots),$$

$\mathbf{z}^{(n)}$ in the functional space corresponds to the part of the Fourier
series of $f(x)$:

$$\sum_{k=1}^{n} a_k \varphi_k(x).$$

We know from [45] that $\mathbf{z}^{(n)} \Rightarrow \mathbf{z}$, which corresponds to the fact
that the integral

$$\int_a^b \Big| f(x) - \sum_{k=1}^{n} a_k \varphi_k(x) \Big|^2 dx$$

tends to zero.

As explained above, to every continuous function of our functional 
space there corresponds a definite vector of the space H. The converse 
statement does not hold, i.e. the vectors of the space H corresponding 
to continuous functions form only part of the space H. If we want the 
converse to be actually true, we have to consider some wider class of 
function than that of continuous functions; but this is a problem that 
we cannot dwell on here. 

We have established the correspondence between the functional space
of continuous functions and the space H by taking a definite system of
orthogonal functions (288) as our starting-point. If we introduce
instead of this the new system of orthogonal and normalized functions

$$\psi_k(x) \quad (k = 1, 2, \dots) \tag{293}$$

the law of correspondence is naturally not the same. It can be shown
that the vectors of space H corresponding to these latter
functions must be subject to a unitary transformation. Of course system
(293) must also be complete in this case.

Definite Fourier coefficients with respect to system (288), or in
other words, a definite Fourier series, correspond to every function
$\psi_m(x)$ of system (293).

We thus have the following array:

$$\psi_m(x) \sim \sum_{k=1}^{\infty} u_{km} \varphi_k(x).$$

The sign $\sim$ merely shows that the function on the left corresponds
to the Fourier series on the right. On taking into account the fact
that the functions $\psi_m(x)$ are normalized, and the closure formula
(283), we have

$$\sum_{k=1}^{\infty} | u_{km} |^2 = 1. \tag{294}$$

In addition, the generalized closure equation holds:

$$\sum_{k=1}^{\infty} u_{kp} \overline{u}_{kq} = \int_a^b \psi_p(x) \overline{\psi_q(x)} \, dx,$$

which gives us, by the orthogonality of the $\psi_k(x)$ and (294):

$$\sum_{k=1}^{\infty} u_{kp} \overline{u}_{kq} = \delta_{pq}. \tag{295}$$

This expression shows us that the matrix $U$ with elements $u_{ik}$
satisfies the condition for its columns to be normalized and orthogonal.
We can show by making use of the results of [48] that the necessary
and sufficient condition for the system of functions (293) to be
complete is that the sum of the squares of the moduli of the elements of
each row of matrix $U$ be equal to unity, i.e.

$$\sum_{k=1}^{\infty} | u_{ik} |^2 = 1 \quad (i = 1, 2, \dots). \tag{296}$$
k=1 


All the above refers to the case when the functional space is
made up of functions of a single independent variable. We can also
consider functions of several independent variables, defined in some
domain of space of more than one dimension. All our discussion
remains in force, the only difference being that the single integrals are
everywhere replaced by multiple integrals over the domain in which
our functions are defined.

We may take as an example the system of functions 


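$$\varphi_k(x) = \frac{1}{\sqrt{2\pi}} e^{ikx} \quad (k = 0, \pm 1, \pm 2, \dots) \tag{297}$$

and let the fundamental interval be $(-\pi, +\pi)$. Functions (297) are
readily seen to form an orthogonal and normalized system. For, if
$p \ne q$:

$$\int_{-\pi}^{\pi} \overline{\varphi_p(x)} \varphi_q(x) \, dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(q-p)x} \, dx = \frac{1}{2\pi i (q - p)} \left[ e^{i(q-p)x} \right]_{x=-\pi}^{x=\pi} = 0,$$

and

$$\int_{-\pi}^{\pi} | \varphi_p(x) |^2 \, dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} dx = 1.$$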


We can show by using the results obtained earlier for Fourier ex- 
pansions that system (297) is likewise complete in the interval in 
question. 


51. Linear functional operators. It is possible to establish for 
functional space a concept corresponding to that of linear transfor- 
mation for the space H. This leads us to linear functional operators. 
Suppose we have a definite rule by which there corresponds to any 
function $f(x)$ (with definite properties) another function $F(x)$:

$$F(x) = L[f(x)], \tag{298}$$

$L$ being the symbolic notation for the correspondence rule. Here we
have as it were a generalized concept of function. The role of argument
is played, not by a variable number, but by a function $f(x)$ which
can be chosen arbitrarily from a certain class, whilst again, the value
of the function, instead of being a number, is a new function $F(x)$. Such a
generalized functional relationship is usually described as a functional
operator or functional. The idea of functional operator is present in a
latent form in a number of problems of mathematical physics. We
may take, for instance, the problem of the vibrations of a string fixed
at the ends. The graph of the string at a given instant $t$ is defined by
the two graphs of the initial conditions, i.e. by the graph of the initial 
displacement and by the graph of initial velocity, so that we are evi- 
dently concerned here with a functional operation. The same type of 
situation is obtained in many other problems of mathematical physics. 
Sometimes the role of argument is played, not by the graph of an 
initial distribution, but say by the contour of the domain to which 
the problem relates. 
The operator L is said to be linear if we have 


$$L[f_1(x) + f_2(x)] = L[f_1(x)] + L[f_2(x)] \quad \text{and} \quad L[cf(x)] = cL[f(x)], \tag{299}$$

where $c$ is a constant.
The condition for the transformation to be bounded is of the form

$$\| L[f(x)] \| \le M \| f(x) \|, \tag{299}$$

where $M$ is a positive constant and $f(x)$ is any function of the functional
space. 

We shall not concern ourselves here with the general theory of 
linear operators but confine the discussion to a few special examples 
that explain the main essence of the concept; its connection with 
linear transformations of space H will also be mentioned, since we have 
established the correspondence between functional space and space H. 

In certain cases a linear functional operator may be written as

$$F(x) = \int_a^b K(x, t) f(t) \, dt, \tag{300}$$

where $K(x, y)$ is a given function of two variables which is usually
known as the kernel (or nucleus) of the operator. In the present case
the kernel is the complete analogue of the array of $a_{ik}$ of a linear
transformation of the space H. Instead of subscripts $i$ and $k$, we have here
the two variables $x$ and $y$ which take a continuous sequence of values,
and expression (300) is entirely analogous to (266). We shall
investigate operators of type (300) in detail when considering integral
equations.

Unitary and Hermitian operators may readily be defined for the
present case. The linear functional operator $L$ is said to be unitary
if we have, for any two functions $f(x)$ and $\varphi(x)$ of a given class:

$$(Lf(x), L\varphi(x)) = (f(x), \varphi(x)). \tag{301}$$

The Hermitian operator $L_1$ is defined by

$$(L_1 f(x), \varphi(x)) = (f(x), L_1 \varphi(x)). \tag{302}$$

Let $L_1$ have the form (300), i.e.

$$L_1 f(x) = \int_a^b K(x, t) f(t) \, dt.$$

We form the scalar products appearing in (302):

$$(f(x), L_1 \varphi(x)) = \int_a^b \int_a^b \overline{K(x, t)} f(x) \overline{\varphi(t)} \, dt \, dx,$$

$$(L_1 f(x), \varphi(x)) = \int_a^b \int_a^b K(x, t) f(t) \overline{\varphi(x)} \, dt \, dx.$$

On changing the notation for the variables of integration in the last
integral, we can write (302) as follows:

$$\int_a^b \int_a^b [\overline{K(x, t)} - K(t, x)] f(x) \overline{\varphi(t)} \, dt \, dx = 0. \tag{303}$$

If the kernel of the operator satisfies the relationship

$$K(x, t) = \overline{K(t, x)}, \tag{304}$$

condition (303) is satisfied for any choice of functions $f(x)$ and $\varphi(x)$,
and the operator $L_1$ is Hermitian in this case. On taking into account
the arbitrariness of the above-mentioned $f(x)$ and $\varphi(x)$, we can say
that equation (304) is not only sufficient but also necessary for
condition (303) to be fulfilled, i.e. for the operator $L_1$ to be Hermitian.
If the kernel $K(x, t)$ is a real function, condition (304) can be written as

$$K(x, t) = K(t, x), \tag{305}$$

i.e. the kernel must in this case be a symmetric function of its
arguments.
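
The role of condition (304) can be seen in a discretized version of operator (300). The sketch below rests on assumptions that are not in the text: the interval is taken as $(0, 1)$, integrals are replaced by midpoint sums, and the two kernels and the functions `f` and `g` are hypothetical examples.

```python
import cmath

STEPS = 200
H = 1.0 / STEPS
NODES = [(j + 0.5) * H for j in range(STEPS)]   # midpoint nodes on the assumed interval (0, 1)

def apply_kernel(kernel, f):
    """Discretized operator (300): F(x) = integral of K(x, t) f(t) dt."""
    return lambda x: H * sum(kernel(x, t) * f(t) for t in NODES)

def inner(f, g):
    """Discretized scalar product (276): integral of f(x) * conj(g(x)) dx."""
    return H * sum(f(x) * complex(g(x)).conjugate() for x in NODES)

def hermitian_defect(kernel, f, g):
    """|(Lf, g) - (f, Lg)|, which vanishes for an Hermitian operator, cf. (302)."""
    return abs(inner(apply_kernel(kernel, f), g) - inner(f, apply_kernel(kernel, g)))

def hermitian_kernel(x, t):
    return cmath.exp(1j * (x - t)) + x * t      # satisfies K(x, t) = conj(K(t, x))

def other_kernel(x, t):
    return x + 2 * t                            # real but not symmetric

def f(x):
    return x + 1j * x * x

def g(x):
    return 1.0

print(hermitian_defect(hermitian_kernel, f, g))
print(hermitian_defect(other_kernel, f, g))
```

The first defect is zero to within rounding, while the second is clearly different from zero, since the real kernel `other_kernel` violates (305).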
We consider a few further examples of linear operators. We take as
our first example the operator consisting of differentiation followed by
multiplication by $1/i$:

$$Lf(x) = \frac{1}{i} f'(x), \tag{306}$$

where $(-\pi, +\pi)$ is taken as the basic interval. We form the scalar
product for operator (306):

$$(f(x), L\varphi(x)) = \Big( f(x), \frac{1}{i} \varphi'(x) \Big) = -\frac{1}{i} \int_{-\pi}^{\pi} f(x) \overline{\varphi'(x)} \, dx.$$

On assuming periodic functions of period $2\pi$ and integrating by
parts, we have

$$\Big( f(x), \frac{1}{i} \varphi'(x) \Big) = -\frac{1}{i} \left[ f(x) \overline{\varphi(x)} \right]_{-\pi}^{\pi} + \frac{1}{i} \int_{-\pi}^{\pi} f'(x) \overline{\varphi(x)} \, dx,$$

whence the equation follows at once:

$$\Big( f(x), \frac{1}{i} \varphi'(x) \Big) = \Big( \frac{1}{i} f'(x), \varphi(x) \Big), \tag{307}$$

i.e. (306) is an Hermitian operator with respect to the class of
differentiable periodic functions.

We select the system of functions (297) as the coordinate system
in functional space. The function $f(x)$ is now characterized by its
Fourier coefficients $a_k$, which are given by

$$a_k = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} e^{-ikx} f(x) \, dx. \tag{308}$$

We shall have different Fourier coefficients $a'_k$ for the function $(1/i) f'(x)$.
We can readily establish a linear transformation giving the $a'_k$ in terms
of the $a_k$. This linear transformation will express the functional
operator (306) in the form of an infinite matrix; at the same time,
it must be borne in mind that this expression for operator (306) will be
referred to a definite choice of coordinate axes in the functional space,
namely the coordinate functions given by (297). We have:

$$a'_k = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} e^{-ikx} \frac{1}{i} f'(x) \, dx,$$

whence, on integrating by parts and assuming $f(x)$ periodic, we find:

$$a'_k = \frac{k}{\sqrt{2\pi}} \int_{-\pi}^{\pi} e^{-ikx} f(x) \, dx,$$

i.e.

$$a'_k = k a_k \quad (k = 0, \pm 1, \pm 2, \dots). \tag{309}$$


This equation in fact expresses the linear transformation concerned.
Its matrix has the form
\[
\begin{Vmatrix}
\ddots & & & & & & \\
\cdots & -2 & 0 & 0 & 0 & 0 & \cdots \\
\cdots & 0 & -1 & 0 & 0 & 0 & \cdots \\
\cdots & 0 & 0 & 0 & 0 & 0 & \cdots \\
\cdots & 0 & 0 & 0 & 1 & 0 & \cdots \\
\cdots & 0 & 0 & 0 & 0 & 2 & \cdots \\
 & & & & & & \ddots
\end{Vmatrix}, \tag{310}
\]
i.e. the matrix is seen to be diagonal. The fact that the rows and
columns of (310) are numbered from $-\infty$ to $+\infty$ instead of from 1 to
$\infty$ is a new point of no real significance. Functions (297) are numbered
in the same manner. We remark that these functions satisfy the
obvious relationship
\[
\frac{1}{i}\, \frac{d}{dx}\, \varphi_k(x) = k\, \varphi_k(x),
\]
i.e. on writing $L$ for operator (306):
\[
L\varphi_k(x) = k\, \varphi_k(x). \tag{311}
\]
By analogy with [37], we can call the $\varphi_k(x)$ the eigenfunctions of the
operator $L$ and the $k$ the corresponding eigenvalues. The diagonal form
of matrix (310) is directly bound up with the fact that the $\varphi_k(x)$ are
the eigenfunctions of operator (306).
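Relation (309), i.e. that differentiation followed by multiplication by 1/i multiplies the k-th Fourier coefficient by k, can be checked numerically. The following is a minimal sketch, assuming the coordinate functions are exp(ikx)/sqrt(2π) on (−π, π); the test function `f` and the quadrature rule are hypothetical choices made for the illustration.

```python
import cmath
import math

def fourier_coefficient(g, k, n=2048):
    # Rectangle-rule approximation of
    # a_k = (1/sqrt(2*pi)) * integral over (-pi, pi) of exp(-i*k*x) * g(x) dx.
    # For periodic trigonometric polynomials this rule is exact up to rounding.
    h = 2 * math.pi / n
    nodes = (-math.pi + j * h for j in range(n))
    return sum(cmath.exp(-1j * k * x) * g(x) for x in nodes) * h / math.sqrt(2 * math.pi)

def f(x):
    # Hypothetical periodic test function (a trigonometric polynomial).
    return 3 * cmath.exp(2j * x) - 1j * cmath.exp(-1j * x) + 0.5

def Lf(x):
    # (1/i) f'(x), differentiated by hand.
    return 6 * cmath.exp(2j * x) + 1j * cmath.exp(-1j * x)

for k in range(-3, 4):
    a_k = fourier_coefficient(f, k)
    a_k_prime = fourier_coefficient(Lf, k)
    print(k, abs(a_k_prime - k * a_k) < 1e-9)
```

Each line of output pairs an index k with the result of the comparison, which is the diagonal action described by matrix (310).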

We take as a second example the operator consisting of multiplication
by the independent variable:
\[
L_1[f(x)] = x f(x). \tag{312}
\]
We find the linear transformation expressing this operator in the
space $H$, when we take (297) as coordinate functions in the functional
space. Let $a_k$ be the Fourier coefficients of $f(x)$ as above, and $a_k'$ the
Fourier coefficients of $x f(x)$, i.e.
\[
a_m' = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{+\pi} e^{-imx}\, x f(x)\, dx \qquad (m = 0, \pm 1, \ldots). \tag{313}
\]
We want to find the linear transformation giving $a_m'$ in terms of $a_k$.
To evaluate integral (313), we find the Fourier coefficients of
$(1/\sqrt{2\pi})\, e^{imx} x$:
\[
c_k = \frac{1}{2\pi} \int_{-\pi}^{+\pi} e^{i(m-k)x}\, x\, dx.
\]
Integration by parts gives us, for $m - k \neq 0$:
\[
c_k = \frac{1}{2\pi} \left[ \frac{x\, e^{i(m-k)x}}{i(m-k)} \right]_{x=-\pi}^{x=+\pi} = \frac{(-1)^{m-k}}{i(m-k)}.
\]
We now find the coefficient $c_m$:
\[
c_m = \frac{1}{2\pi} \int_{-\pi}^{+\pi} x\, dx = 0.
\]
We re-write (313) as
\[
a_m' = \int_{-\pi}^{+\pi} f(x)\, \overline{\frac{1}{\sqrt{2\pi}}\, e^{imx} x}\; dx
\]
and use the generalized closure equation (286), the Fourier coefficients
of $f(x)$ being given by $a_k$ and the Fourier coefficients of $(1/\sqrt{2\pi})\, e^{imx} x$
by the above expressions. We obtain for $a_m'$:
\[
a_m' = i \sum_{k=-\infty}^{+\infty}{}^{\prime}\; \frac{(-1)^{m-k}}{m-k}\, a_k; \tag{314}
\]


the prime on the summation sign indicates that terms corresponding 
to k = m must be excluded. Expression (314) in fact gives the linear 
transformation of space H corresponding to operator (312), if (297) are 
taken as the coordinate functions in the functional space. 

In general, suppose we take as coordinate functions some complete
system of orthogonal and normalized functions
\[
\varphi_1(x),\ \varphi_2(x),\ \ldots
\]
If $H$ is a linear Hermitian operator, where
\[
H\varphi_i(x) \sim \sum_{k=1}^{\infty} a_{ik}\, \varphi_k(x),
\]
we have $a_{ik} = \overline{a_{ki}}$. Let
\[
\psi(x) \sim \sum_{k=1}^{\infty} c_k\, \varphi_k(x)
\]
give the Fourier series of a function $\psi(x)$. For the function $H\psi(x)$,
we have the new Fourier series
\[
H\psi(x) \sim \sum_{k=1}^{\infty} c_k'\, \varphi_k(x),
\]
where it can be shown that
\[
c_k' = \sum_{i=1}^{\infty} a_{ik}\, c_i.
\]
This linear transformation in fact expresses the operator $H$ if the
$\varphi_k(x)$ are taken as coordinate functions.

We return to the differentiation operator (306). Even if we only
consider continuous functions, this operator cannot be applied to all of
them, since continuous functions exist that lack a derivative for every
value of $x$. The linear transformation (309) of space $H$ corresponds to
operator (306). If the series $\sum_{k=-\infty}^{+\infty} |a_k|^2$ is convergent, the series
\[
\sum_{k=-\infty}^{+\infty} |a_k'|^2 = \sum_{k=-\infty}^{+\infty} k^2\, |a_k|^2
\]
may conceivably be divergent. This shows that transformation (309)
is not applicable to the whole of the space H, which is in accordance 
with what we said above. 


CHAPTER III


THE BASIC THEORY OF GROUPS 
AND LINEAR REPRESENTATIONS 
OF GROUPS 


52. Groups of linear transformations. We consider the set of all
unitary transformations in n-dimensional space. All these
transformations have a non-zero determinant, so that for any unitary
transformation $Ux$, which is completely characterized by its matrix
$U$, there is a fully defined inverse transformation $U^{-1}x$ which
is also unitary [28]. Furthermore, if $U_1x$ and $U_2x$ are two
unitary transformations, their product $U_2U_1x$ is also unitary. All
these properties of the set of all unitary transformations may be
briefly expressed by saying that the set of unitary transformations
forms a group.

A set of linear transformations with non-zero determinants in 
general forms a group if the following two conditions are fulfilled: 
firstly, the inverse of any transformation belonging to the set also belongs 
to the set, and secondly, the product in any order of two transformations 
belonging to the set also belongs to the set, the transformations multiplied 
being possibly identical. 

Bearing in mind that the product of any transformation with its 
inverse is the identity transformation, we can say that a group must 
contain the identity transformation, i.e. the unit matrix. 

Since a linear transformation is always fully defined by its matrix, 
it is immaterial whether, in the above or in what follows, we speak 
of groups of linear transformations or groups of matrices. 

Further examples may be given of groups of linear transformations.
The set of all real orthogonal transformations may easily be seen to
form a group. We know that real orthogonal transformations have
determinants equal to (±1). If we take the set of real orthogonal
transformations with (+1) determinants, we also get a group. The set
of real orthogonal transformations with (−1) determinants does not
form a group, however, since the product of two matrices with (−1)
determinants yields a matrix with a (+1) determinant.

In particular, if we take the group of real orthogonal transformations
in three variables, this consists of pure rotations of space about the
origin, and of the transformations resulting from such a rotation
together with a symmetry transformation with respect to the origin.
Whereas if we take the group of linear orthogonal transformations
in three variables with (+1) determinants, we get the group of
rotations of space about the origin.

All the groups mentioned have contained an infinite set of
transformations; in particular, the group of rotations of three-dimensional
space about the origin depends on three arbitrary real parameters, i.e.
the Euler angles that we discussed above.

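We take as an example the rotation of space about the z axis by an
angle $\varphi$, the expressions for which are
\[
\begin{aligned}
x' &= x\cos\varphi - y\sin\varphi, \\
y' &= x\sin\varphi + y\cos\varphi.
\end{aligned} \tag{1}
\]
If the real parameter $\varphi$ takes all values in the interval $(0, 2\pi)$, we
obviously get a group containing an infinite set of transformations
and depending on a single real parameter. We bring in the following
notation for the matrix of the transformation:
\[
Z_\varphi = \begin{Vmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{Vmatrix}. \tag{2}
\]
It is immediately clear that the product of two rotations by the
angles $\varphi_1$ and $\varphi_2$ yields a rotation by the angle $(\varphi_1 + \varphi_2)$:
\[
Z_{\varphi_2} Z_{\varphi_1} = Z_{\varphi_1 + \varphi_2}, \tag{3}
\]
and similarly,
\[
Z_{\varphi_1} Z_{\varphi_2} = Z_{\varphi_1 + \varphi_2}.
\]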


This shows us that here all the transformations, or as we say, all the 
elements of the group commute in pairs. Such a group is termed 
Abelian. In the last example, moreover, multiplication of two elements 
amounts simply to addition of the two corresponding parameters. 
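The addition rule for the rotation parameter is easy to confirm numerically. The sketch below is an illustration only: the helper names `rotation`, `matmul` and `close` and the two sample angles are hypothetical, and the matrices are the 2 × 2 rotation matrices of form (2).

```python
import math

def rotation(phi):
    # Matrix (2) of the rotation of the plane by the angle phi.
    return [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]

def matmul(a, b):
    # Product of two 2 x 2 matrices: apply b first, then a.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

p1, p2 = 0.7, 2.1   # arbitrary sample angles
print(close(matmul(rotation(p2), rotation(p1)), rotation(p1 + p2)))   # True
print(close(matmul(rotation(p1), rotation(p2)), rotation(p1 + p2)))   # True
```

Both orders of multiplication give the rotation by the sum of the angles, which is the Abelian property stated above.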

We can somewhat extend the last example by taking reflection
in the y axis as well as rotation of the xy plane about the origin.
It is clearly immaterial in what order these operations are carried out,
i.e. we can first rotate about the origin then reflect symmetrically in
the y axis, or vice versa. Changing the order affects the result, but the
total set of transformations is the same in both cases. The set consists
of real orthogonal transformations in two variables. The matrix has
the general form
\[
\{\varphi, d\} = \begin{Vmatrix} d\cos\varphi & -d\sin\varphi \\ \sin\varphi & \cos\varphi \end{Vmatrix}, \tag{4}
\]
where $\varphi$ is the previous parameter and $d$ is a number equal to ±1.
With $d = 1$ we get rotation of the xy plane about the origin, whilst
with $d = -1$ we get rotation followed by the reflection. The following
rule is readily derived for multiplication of matrices (4):
\[
\{\varphi_2, d_2\}\,\{\varphi_1, d_1\} = \{\varphi_1 + d_1\varphi_2,\; d_1 d_2\}. \tag{5}
\]

The product can now depend on the order of the factors, i.e. the 
present group is not Abelian. Similarly, the group of real orthogonal 
transformations in three-dimensional space is clearly not Abelian, 
and the same can be said even of the group of rotations of three- 
dimensional space. 
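The multiplication rule for matrices of form (4) and the failure of commutativity can both be checked directly. This is a sketch under assumptions: the rule is taken in the form {φ₂, d₂}{φ₁, d₁} = {φ₁ + d₁φ₂, d₁d₂}, the right-hand factor being applied first, and the helper names and sample values are hypothetical.

```python
import math

def element(phi, d):
    # Matrix {phi, d} of form (4): d = 1 is a rotation by phi,
    # d = -1 is the same rotation followed by reflection in the y axis.
    return [[d * math.cos(phi), -d * math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

rule_holds = all(
    close(matmul(element(p2, d2), element(p1, d1)), element(p1 + d1 * p2, d1 * d2))
    for p1, p2 in [(0.3, 0.5), (1.2, -2.0)]
    for d1 in (1, -1)
    for d2 in (1, -1)
)

a, b = element(0.3, 1), element(0.5, -1)
commute = close(matmul(a, b), matmul(b, a))
print(rule_holds, commute)   # True False
```

The two products of `a` and `b` correspond to the parameters {0.2, −1} and {0.8, −1}, so the order of the factors matters as soon as a reflection is involved.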

The examples so far have been of groups containing an infinite set 
of transformations (elements), the corresponding matrices having con- 
tained arbitrary real parameters. We now mention some examples of 
groups containing a finite number of elements. Let m be a given 
positive integer. We consider the set of rotations of the xy plane
about the origin by the angles
\[
0,\ \frac{2\pi}{m},\ \frac{4\pi}{m},\ \ldots,\ \frac{2(m-1)\pi}{m},
\]
so that we have altogether $m$ transformations, with matrices
\[
Z_{2k\pi/m} = \begin{Vmatrix} \cos\dfrac{2k\pi}{m} & -\sin\dfrac{2k\pi}{m} \\[2mm] \sin\dfrac{2k\pi}{m} & \cos\dfrac{2k\pi}{m} \end{Vmatrix} \qquad (k = 0, 1, \ldots, m-1).
\]
These transformations clearly form a group, the elements of which
are non-negative integral powers of a single transformation, i.e.
\[
Z_{2k\pi/m} = \bigl(Z_{2\pi/m}\bigr)^k \qquad (k = 0, 1, \ldots, m-1). \tag{6}
\]


A finite group made up of powers of a transformation is usually 
described as cyclical. 
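A cyclic group of rotations can be generated by repeated multiplication by a single matrix. The sketch below is an illustration with hypothetical helper names and a sample order m = 6; it builds the powers of the rotation by 2π/m and checks that they are pairwise distinct and that one further factor returns to the unit matrix.

```python
import math

def rotation(phi):
    return [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

m = 6   # sample order of the group
generator = rotation(2 * math.pi / m)
identity = [[1.0, 0.0], [0.0, 1.0]]

# powers[k] is the k-th power of the generator, k = 0, 1, ..., m - 1.
powers = [identity]
for _ in range(m - 1):
    powers.append(matmul(generator, powers[-1]))

distinct = all(not close(powers[i], powers[j]) for i in range(m) for j in range(i))
closes_up = close(matmul(generator, powers[-1]), identity)
print(distinct, closes_up)   # True True
```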
If we take an angle $\varphi_0$ that is not a rational multiple of $\pi$, the
transformations (matrices)
\[
Z_{\varphi_0}^k = Z_{k\varphi_0} \qquad (k = 0, \pm 1, \pm 2, \ldots) \tag{7}
\]
clearly also form a group. But we now have a group with an infinite
set of elements, since there is no non-zero integral power $k$ for which
$Z_{\varphi_0}^k$ coincides with $Z_0 = I$. Group (7) is infinite, yet its matrices do not
contain a continuously varying parameter. We say in this case that 
the elements in the group are enumerable, i.e. we can provide every 
element with an integral subscript, in such a way that different sub- 
scripts correspond to different elements, and every integer is the sub- 
script of an element. We cannot do this in the case of groups containing 
continuously varying parameters. 


53. Groups of regular polyhedra. Finite groups may also be formed 
by rotation of three-dimensional space about the origin. We know that 
these rotations are expressed, in a given coordinate system, by linear 
transformations of the coordinates. It must be pointed out that when 
we speak of a rotation of space about the origin, we understand simply 
the final effect of passage from the initial to the transformed position. 
How the passage is achieved is completely immaterial for our discus- 
sion. In fact, any linear transformation defines the coordinates of 
transformed points but naturally says nothing about the actual 
mechanism of the transformation, so that this latter plays no part at 
all in our arguments. 

We take a sphere with centre at the origin and unit radius.
We inscribe a regular polyhedron in the sphere, say the octahedron
of Fig. 2. The surface of this polyhedron is known to consist of eight
equilateral triangles. We consider the set of rotations of
three-dimensional space about the origin such that the octahedron is
transformed into itself. This set may easily be seen to form a group
which contains a finite number of elements. Let us find the number
of elements. We take any axis joining two opposite vertices of the
octahedron. The octahedron is transformed into itself if we rotate
the space about this axis by angles of $0$, $\pi/2$, $\pi$, $3\pi/2$. Rotation
by the angle 0 is evidently the identity transformation, i.e. corresponds
to the unit matrix. We shall write our four rotations about this axis as
\[
S_0 = I,\ S_1,\ S_2,\ S_3. \tag{8}
\]


Let $A$ be a vertex of the octahedron on our axis. We introduce
the five linear transformations
\[
T_1,\ T_2,\ T_3,\ T_4,\ T_5,
\]
transforming the octahedron into itself and such that $A$ coincides with
one of the remaining five vertices. Along with the four rotations (8)
we compose a further 20 rotations of space about the origin as
follows:
\[
T_kS_0,\ T_kS_1,\ T_kS_2,\ T_kS_3 \qquad (k = 1, 2, 3, 4, 5). \tag{9}
\]
The 24 rotations (8) and (9) are easily seen to be distinct. This is
completely obvious geometrically, whilst it can also be shown as
follows: let
\[
T_pS_q = T_{p_1}S_{q_1}. \tag{10}
\]
The transformations $S_k$ correspond to rotations about an axis
passing through the vertex $A$, so that they do not alter the position of $A$.
Transformations $T_p$ and $T_{p_1}$, for different subscripts $p$ and $p_1$, shift $A$
to different vertices, and it therefore follows from (10) that $p$ and $p_1$
are the same; but now it follows from the same equation, after
multiplying on the left by $T_p^{-1} = T_{p_1}^{-1}$, that $q$ and $q_1$ are the same, i.e. (10)
is only valid when the left and right-hand sides consist of the same
factors. We thus have 24 distinct rotations from (8) and (9), for which
the octahedron is displaced into itself. We now show that these are
all the rotations having this property. Let $V$ be a rotation transforming
the octahedron into itself. Suppose that, with this, $A$ is displaced to
another vertex $A_j$ (if $A$ is not displaced, $V$ is itself one of the
rotations $S_k$), and let $T_j$ be that one of the transformations $T_k$ which also
shifts $A$ to $A_j$. We compose the transformation $T_j^{-1}V$. With this, the
octahedron transforms into itself and $A$ remains in position. The
opposite vertex consequently also remains in position, so that the
transformation we have composed is one of the rotations $S_k$ about the
axis through $A$, i.e. $T_j^{-1}V = S_k$, whence $V = T_jS_k$. In other words,
any transformation shifting the octahedron into itself must be
included in the 24 rotations composed above. Thus, finally, the group of
rotations transforming an octahedron into itself consists of 24 elements.
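The count of 24 can be reproduced by direct enumeration. The sketch below rests on an assumption about the position of the octahedron that the text does not make: its six vertices are placed at the points ±1 on the three coordinate axes. A rotation that maps this vertex set into itself must then send each coordinate axis to a coordinate axis, so its matrix has a single entry ±1 in every row and column, with determinant +1.

```python
from itertools import permutations, product

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def signed_permutation_matrices():
    # All 3 x 3 matrices with exactly one entry +1 or -1 in each row and column.
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            yield tuple(tuple(signs[i] if j == perm[i] else 0 for j in range(3))
                        for i in range(3))

def apply(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

rotations = {m for m in signed_permutation_matrices() if det3(m) == 1}
A = (0, 0, 1)                                            # the chosen vertex
stabilizer = [m for m in rotations if apply(m, A) == A]  # rotations about the axis through A
orbit = {apply(m, A) for m in rotations}                 # vertices that A can be moved to
closed = all(matmul(a, b) in rotations for a in rotations for b in rotations)

print(len(rotations), len(stabilizer), len(orbit), closed)   # 24 4 6 True
```

The figures 4 and 6 correspond to the four rotations (8) about the axis through A and to the six positions A can occupy (its own and the five reached by the transformations T_k), whose product gives the 24 elements.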

We can evidently inscribe a cube in the unit sphere such that the
radii passing through the centres of the faces of the octahedron end
in the vertices of the cube. It follows directly from this that the group
of rotations is the same for a cube as for an octahedron. Suppose we
take a new position of the octahedron, obtained from the original
position with the aid of a rotation having the matrix $U$. If $V$ is a
rotation that displaces the original octahedron into itself, $UVU^{-1}$ clearly
yields a rotation that displaces the new octahedron into itself, and
conversely. Thus if the group of rotations of the original octahedron
consists of matrices $V_k$ $(k = 1, 2, \ldots, 24)$, the rotation group for
the new octahedron simply consists of the similar matrices $UV_kU^{-1}$.
In other words, we obtain a similar group. In general, if a set of matrices
$V_k$ forms a group, the set of similar matrices $UV_kU^{-1}$, with any fixed
$U$, also forms a group. We leave the proof to the reader; it
follows easily from the definition of a group. The second group
is usually described as similar to the first.

We now consider the tetrahedron, having four vertices and a surface
consisting of four equilateral triangles. We take any axis joining a
vertex $A$ to the centre of the opposite face. The tetrahedron is
transformed into itself if we rotate the space in a given direction about
the axis by angles of $0$, $2\pi/3$, $4\pi/3$. Let these rotations be $S_0$, $S_1$, $S_2$.
We also introduce the three linear transformations $T_1$, $T_2$, $T_3$ by
which the tetrahedron is displaced into itself, the vertex $A$ being
brought to coincide with one of the three remaining vertices. In
addition to $S_0$, $S_1$, $S_2$, we compose the nine rotations $T_kS_0$, $T_kS_1$, $T_kS_2$
$(k = 1, 2, 3)$, which gives us altogether 12 rotations that are distinct
and that represent all the rotations transforming the tetrahedron
into itself.

We now take the icosahedron whose surface consists of twenty
equilateral triangles and whose vertices are twelve in number. As
above, we take any axis joining a vertex $A$ to the opposite vertex.
The icosahedron is displaced into itself by rotating the space by angles
of $2k\pi/5$ $(k = 0, 1, 2, 3, 4)$. Let these rotations be $S_k$. We have further
eleven rotations $T_l$ $(l = 1, 2, \ldots, 11)$ for which the vertex $A$ becomes
one of the remaining vertices and the icosahedron is displaced into
itself. The total group of rotations transforming the icosahedron into
itself consists of the five rotations $S_k$ and 55 rotations $T_lS_k$. Thus
the group contains altogether 60 rotations. The same group is obtained
for a dodecahedron, with twenty vertices and twelve regular pentagonal
faces. This can be seen by arranging the dodecahedron with respect
to the icosahedron in a similar manner to that used above for arranging
a cube with respect to an octahedron.

We consider one further group of rotations of three-dimensional
space. Suppose we have a regular n-sided polygon in the xy plane,
with its centre at the origin. We take an axis joining a vertex $A$ to
the opposite vertex (if $n$ is even), or to the middle of the opposite
side (if $n$ is odd). The polygon is displaced into itself by rotation
about the axis by angles of $0$ and $\pi$. The first rotation is the identity
transformation $I$, whilst we write $S$ for the second.

In addition, the rotations $T_k$ about the z axis by angles of $2k\pi/n$
$(k = 1, 2, \ldots, n-1)$ displace the vertex $A$ to another vertex and
transform the polygon into itself. We have the identity
transformation $T_0 = I$ with $k = 0$. The total group of transformations displacing
the polygon into itself contains the following $2n$ elements: $T_k$ and
$T_kS$ $(k = 0, 1, 2, \ldots, n-1)$.

The above n-sided polygon, whose surface is taken twice (top and
bottom), is usually termed a dihedron, the corresponding group being 
the dihedral group. 
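The 2n elements of the dihedral group can be listed by recording how each transformation permutes the vertices. The sketch below makes assumptions not found in the text: the vertices are numbered 0, 1, ..., n − 1 around the polygon with A as vertex 0, the rotation T_k sends vertex j to j + k (mod n), and the half-turn S about the axis through A sends vertex j to −j (mod n). It is meant for n ≥ 3, where distinct transformations give distinct permutations of the vertices.

```python
def compose(p, q):
    # Permutation obtained by applying q first, then p.
    return tuple(p[q[j]] for j in range(len(q)))

def dihedral(n):
    # Vertex permutations T_k and T_k S for k = 0, 1, ..., n - 1.
    S = tuple((-j) % n for j in range(n))
    T = [tuple((j + k) % n for j in range(n)) for k in range(n)]
    return set(T) | {compose(t, S) for t in T}

n = 5   # sample number of sides
G = dihedral(n)
closed = all(compose(a, b) in G for a in G for b in G)
print(len(G), closed)   # 10 True
```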


54. Lorentz transformations. All the above examples of groups of
linear transformations have consisted of unitary transformations or
of rotations of three-dimensional space (a particular case of unitary
transformation). We now investigate a new group of linear trans- 
formations where the elements are not unitary. This group has an 
important role in relativity, electrodynamics and relativistic quantum 
mechanics. 

We take four variables $x_1, x_2, x_3, x_4$, the first three being the spatial
coordinates of a point and the last being time. The fundamental
requirement of the special theory of relativity for the invariance of
a certain definite velocity $c$ (the speed of light) in the case of relative
motion leads to the following problem: for what linear
transformations of the above four variables is the expression
\[
x_1^2 + x_2^2 + x_3^2 - c^2 x_4^2
\]
invariant? To be more explicit, we want to find the linear
transformations giving the new variables $x_k'$ in terms of the original $x_k$
such that we have the identity
\[
x_1'^2 + x_2'^2 + x_3'^2 - c^2 x_4'^2 = x_1^2 + x_2^2 + x_3^2 - c^2 x_4^2.
\]
We first take the case when the coordinates $x_2$ and $x_3$ remain
unchanged, so that $x_1$ and $x_4$ are the only variables in the linear
transformation. We thus have to find the transformations
\[
x_1' = a_{11}x_1 + a_{14}x_4, \qquad x_4' = a_{41}x_1 + a_{44}x_4 \tag{11}
\]
such that
\[
x_1'^2 - c^2 x_4'^2 = x_1^2 - c^2 x_4^2. \tag{12}
\]


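We replace $x_4$ by a new pure imaginary variable given by
\[
y_4 = icx_4.
\]
The required linear transformations must have the form
\[
x_1' = \alpha_{11}x_1 + \alpha_{14}y_4, \qquad y_4' = \alpha_{41}x_1 + \alpha_{44}y_4, \tag{13}
\]
where
\[
\alpha_{11} = a_{11}; \quad \alpha_{14} = \frac{a_{14}}{ic}; \quad \alpha_{41} = ic\,a_{41}; \quad \alpha_{44} = a_{44},
\]
whilst condition (12) may be re-written as
\[
x_1'^2 + y_4'^2 = x_1^2 + y_4^2. \tag{14}
\]
The coefficients $\alpha_{11}$ and $\alpha_{44}$ must be real, and $\alpha_{14}$ and $\alpha_{41}$ pure imaginary.
We therefore write $\alpha_{14} = i\beta_{14}$ and $\alpha_{41} = i\beta_{41}$. Condition (14) is
evidently equivalent to requiring orthogonality of transformations (13),
and we can write that the sum of the squares of the elements of each
row and column must be equal to unity. It is easily verified that this
gives us $\beta_{14}^2 = \beta_{41}^2 = \alpha_{11}^2 - 1 = \alpha_{44}^2 - 1$ and $\alpha_{11}^2 = \alpha_{44}^2$. Let $\alpha_{11} = \alpha$
and $\beta_{14} = \alpha\beta$. We shall take $\alpha_{11}$ and $\alpha_{44}$ positive, which corresponds
to invariance of the direction of measuring $x_1$ and $x_4$. Thus instead
of (13), we have by the above relationships:
\[
x_1' = \alpha x_1 + i\alpha\beta\, y_4, \qquad y_4' = \alpha_{41}x_1 + \alpha y_4.
\]
The condition for orthogonality of rows,
\[
\alpha\alpha_{41} + i\alpha^2\beta = 0,
\]
gives us $\alpha_{41} = -i\alpha\beta$, i.e. $\beta_{14}$ and $\beta_{41}$ must have opposite signs. Finally
the condition
\[
\alpha_{11}^2 + \alpha_{14}^2 = 1
\]
gives us
\[
\alpha^2 - \alpha^2\beta^2 = 1, \qquad \alpha = \frac{1}{\sqrt{1 - \beta^2}} \qquad (\beta^2 < 1),
\]
and we arrive at the following expressions:
\[
x_1' = \frac{x_1 + i\beta y_4}{\sqrt{1 - \beta^2}}, \qquad y_4' = \frac{-i\beta x_1 + y_4}{\sqrt{1 - \beta^2}},
\]
or, on again returning from $y_4 = icx_4$ to the original $x_4$:
\[
x_1' = \frac{x_1 - \beta c\, x_4}{\sqrt{1 - \beta^2}}, \qquad x_4' = \frac{-\dfrac{\beta}{c}\, x_1 + x_4}{\sqrt{1 - \beta^2}}. \tag{15}
\]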


It follows directly from these equations that the coordinate system
corresponding to the primed variables moves with respect to the
original coordinate system with a velocity
\[
v = \beta c \tag{16}
\]
in the direction of the $x_1$ axis. For if we take $x_1'$ constant, we get
\[
dx_1 - \beta c\, dx_4 = 0, \quad \text{i.e.} \quad \frac{dx_1}{dx_4} = \beta c.
\]
On replacing $\beta$ by the velocity $v$ in accordance with (16), $x_1$ by
$x$ and $x_4$ by $t$, we get the usual form of the Lorentz transformation
in two variables:
\[
x' = \frac{x - vt}{\sqrt{1 - \dfrac{v^2}{c^2}}}, \qquad t' = \frac{t - \dfrac{v}{c^2}\, x}{\sqrt{1 - \dfrac{v^2}{c^2}}}. \tag{17}
\]
In the limit as $c \to \infty$, we get the ordinary expressions for relative
motion in classical mechanics:
\[
x' = x - vt; \qquad t' = t.
\]

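The Lorentz transformations (17), depending on the single real
parameter $v$, may easily be seen to form a group. On solving equations
(17) with respect to $x$ and $t$, we obtain the inverse transformation to
(17). This is the Lorentz transformation obtained by replacing $v$ by
$(-v)$ in (17). For we have by solving (17):
\[
\Bigl(1 - \frac{v^2}{c^2}\Bigr) x = \sqrt{1 - \frac{v^2}{c^2}}\; (x' + vt'); \qquad
\Bigl(1 - \frac{v^2}{c^2}\Bigr) t = \sqrt{1 - \frac{v^2}{c^2}}\; \Bigl(\frac{v}{c^2}\, x' + t'\Bigr),
\]
whence it follows that
\[
x = \frac{x' + vt'}{\sqrt{1 - \dfrac{v^2}{c^2}}}, \qquad t = \frac{t' + \dfrac{v}{c^2}\, x'}{\sqrt{1 - \dfrac{v^2}{c^2}}}.
\]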


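We now consider the Lorentz transformations $L_{v_1}$ and $L_{v_2}$
corresponding to the parametric values $v = v_1$ and $v = v_2$. We form their product
$L_{v_2}L_{v_1}$ and show that this is also a Lorentz transformation. We have
to form the product of two matrices:
\[
\begin{Vmatrix}
\dfrac{1}{\sqrt{1 - \beta_2^2}} & -\dfrac{\beta_2 c}{\sqrt{1 - \beta_2^2}} \\[3mm]
-\dfrac{\beta_2 / c}{\sqrt{1 - \beta_2^2}} & \dfrac{1}{\sqrt{1 - \beta_2^2}}
\end{Vmatrix}
\cdot
\begin{Vmatrix}
\dfrac{1}{\sqrt{1 - \beta_1^2}} & -\dfrac{\beta_1 c}{\sqrt{1 - \beta_1^2}} \\[3mm]
-\dfrac{\beta_1 / c}{\sqrt{1 - \beta_1^2}} & \dfrac{1}{\sqrt{1 - \beta_1^2}}
\end{Vmatrix},
\]
where
\[
\beta_1 = \frac{v_1}{c}; \qquad \beta_2 = \frac{v_2}{c}.
\]
The usual multiplication rules give us the following matrix product:
\[
\frac{1 + \beta_1\beta_2}{\sqrt{1 - \beta_1^2}\, \sqrt{1 - \beta_2^2}}
\begin{Vmatrix}
1 & -\dfrac{(\beta_1 + \beta_2)\, c}{1 + \beta_1\beta_2} \\[3mm]
-\dfrac{\beta_1 + \beta_2}{c\,(1 + \beta_1\beta_2)} & 1
\end{Vmatrix}. \tag{18}
\]
We introduce the new quantity
\[
v_3 = \frac{v_1 + v_2}{1 + \dfrac{v_1 v_2}{c^2}}. \tag{19}
\]
We can easily verify the identity
\[
\frac{1 + \dfrac{v_1 v_2}{c^2}}{\sqrt{1 - \dfrac{v_1^2}{c^2}}\, \sqrt{1 - \dfrac{v_2^2}{c^2}}} = \frac{1}{\sqrt{1 - \dfrac{v_3^2}{c^2}}},
\]
as a result of which matrix (18) can be written as follows:
\[
\begin{Vmatrix}
\dfrac{1}{\sqrt{1 - \beta_3^2}} & -\dfrac{\beta_3 c}{\sqrt{1 - \beta_3^2}} \\[3mm]
-\dfrac{\beta_3 / c}{\sqrt{1 - \beta_3^2}} & \dfrac{1}{\sqrt{1 - \beta_3^2}}
\end{Vmatrix}
\qquad \Bigl(\beta_3 = \frac{v_3}{c}\Bigr),
\]
i.e. it in fact corresponds to the Lorentz transformation for $v = v_3$.
Expression (19) thus gives the rule for adding velocities in the special
theory of relativity. If we set $v_1 = c$ in (19), it may readily be seen that
we also get $v_3 = c$ for the resultant velocity, i.e. the velocity $c$ is in
fact unchanged when two motions are superimposed.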
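The group property and the rule for adding velocities can be checked numerically. The sketch below is an illustration under assumptions: it works in units with c = 1, takes the matrix of transformation (17) to act on the column (x, t), and uses hypothetical sample velocities; the function names are not from the text.

```python
import math

C = 1.0   # units with c = 1, an assumption made only to keep the numbers simple

def lorentz(v, c=C):
    # Matrix of transformation (17) acting on the column (x, t).
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return [[g, -g * v],
            [-g * v / c ** 2, g]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add_velocities(v1, v2, c=C):
    # Rule (19) for the resultant velocity.
    return (v1 + v2) / (1.0 + v1 * v2 / c ** 2)

v1, v2 = 0.5, 0.6   # sample velocities, both less than c
product_matrix = matmul(lorentz(v2), lorentz(v1))
expected = lorentz(add_velocities(v1, v2))
same = all(math.isclose(product_matrix[i][j], expected[i][j], rel_tol=1e-12)
           for i in range(2) for j in range(2))

# Invariance of x^2 - c^2 t^2 under a single transformation.
x, t = 2.0, 3.0
m = lorentz(v1)
x_new = m[0][0] * x + m[0][1] * t
t_new = m[1][0] * x + m[1][1] * t
invariant = math.isclose(x_new ** 2 - C ** 2 * t_new ** 2, x ** 2 - C ** 2 * t ** 2)

print(same, invariant, add_velocities(C, v2))
```

The last printed value illustrates the closing remark: combining the velocity c with any other velocity by rule (19) again gives c. The matrix `lorentz(C)` itself is not defined, since the square root in (17) vanishes there.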

When deriving (15), we fixed the signs of the coefficients of linear
transformation (11) in a definite manner, i.e. we assumed $a_{11}$ and
$a_{44}$ positive. An alternative requirement is positiveness of the
coefficient $a_{44}$ and of the determinant
\[
a_{11}a_{44} - a_{14}a_{41}. \tag{20}
\]
Positiveness of $a_{11}$ is easily seen to follow as a consequence of
this, and vice versa. For the determinant of transformation (17) is
equal to (+1), i.e. with $a_{44} > 0$, (20) is also positive. If we were to
take $a_{11} = -\alpha$ and $a_{44} = \alpha$, where $\alpha > 0$, we should have a
transformation with a (−1) determinant. The condition that $a_{44}$ is positive
is equivalent to the fact that we have $x_4' \to +\infty$ with fixed $x_1$ and
$x_4 \to +\infty$. This can be said to correspond to invariance in the direction
for measuring time. Thus the formulae do not give all the linear
transformations satisfying condition (12), but only those for which
determinant (20) is positive and which do not vary the direction for
measuring time.
We now return to the general Lorentz transformation for four
variables $x_k$ $(k = 1, 2, 3, 4)$, where we must satisfy the condition
\[
x_1'^2 + x_2'^2 + x_3'^2 - c^2 x_4'^2 = x_1^2 + x_2^2 + x_3^2 - c^2 x_4^2. \tag{21}
\]
We shall take $x_k$ $(k = 1, 2, 3)$ and $x_k'$ $(k = 1, 2, 3)$ as Cartesian
coordinates in two different three-dimensional spaces $R$ and $R'$.
We show that, by suitable choice of coordinate axes in the two spaces,
the general Lorentz transformation can be reduced to the particular
case discussed above. Let $T$ denote the general and $S$ the particular
Lorentz transformation. Our assertion is equivalent to the fact
that we can write $T$ as
\[
T = VSU, \tag{22}
\]
where $U$ and $V$ are real orthogonal transformations corresponding to
the above-mentioned coordinate transformations in spaces $R$ and $R'$.
We bring in four new variables as above:
\[
y_1 = x_1; \quad y_2 = x_2; \quad y_3 = x_3; \quad y_4 = icx_4,
\]
and similarly
\[
y_1' = x_1'; \quad y_2' = x_2'; \quad y_3' = x_3'; \quad y_4' = icx_4'.
\]
We obtain for the new variables, instead of (21), the ordinary
condition for an orthogonal transformation:
\[
y_1'^2 + y_2'^2 + y_3'^2 + y_4'^2 = y_1^2 + y_2^2 + y_3^2 + y_4^2. \tag{23}
\]
The required linear transformation will be of the form
\[
y_k' = \alpha_{k1}y_1 + \alpha_{k2}y_2 + \alpha_{k3}y_3 + \alpha_{k4}y_4 \qquad (k = 1, 2, 3, 4). \tag{24}
\]
Observing that $y_4$ and $y_4'$ must be pure imaginary, we can say that
coefficients $\alpha_{k1}, \alpha_{k2}, \alpha_{k3}$ with $k = 1, 2, 3$, and also $\alpha_{44}$, must be real,
whilst $\alpha_{41}, \alpha_{42}, \alpha_{43}$, and $\alpha_{k4}$ with $k = 1, 2, 3$ must be pure imaginary.
A change of coordinate axes in the space $R'$ is equivalent to a real
orthogonal transformation on the variables $y_1', y_2', y_3'$. We consider
the coefficients
\[
\alpha_{14} = i\beta_{14}; \quad \alpha_{24} = i\beta_{24}; \quad \alpha_{34} = i\beta_{34}.
\]


The real numbers $\beta_{14}, \beta_{24}, \beta_{34}$ define a certain vector; if we take
the direction of this as the new first axis in the space $R'$, the
coefficients $\alpha_{24}$ and $\alpha_{34}$ vanish as a result of the corresponding orthogonal
transformation. To see this, we only need to notice that by (24), an
orthogonal transformation on the variables $y_1', y_2', y_3'$ amounts to the
same transformation on $\beta_{14}, \beta_{24}, \beta_{34}$. We shall, therefore, suppose that
this coordinate transformation in the space $R'$ has already been
carried out, so that we have $\alpha_{24} = \alpha_{34} = 0$. Condition (23) shows that
the coefficients of transformation (24) must satisfy the ordinary
conditions for an orthogonal transformation. On recalling that the
coefficients mentioned vanish, consideration of the second and third
rows gives us the following conditions:
\[
\alpha_{k1}^2 + \alpha_{k2}^2 + \alpha_{k3}^2 = 1 \quad (k = 2, 3), \qquad
\alpha_{21}\alpha_{31} + \alpha_{22}\alpha_{32} + \alpha_{23}\alpha_{33} = 0,
\]
where all the coefficients present are real. By the conditions written,
the vectors with components $(\alpha_{21}, \alpha_{22}, \alpha_{23})$ and $(\alpha_{31}, \alpha_{32}, \alpha_{33})$ are mutually
orthogonal and of unit length. If we choose these two as fundamental
vectors in the space $R$, directed along the $x_2$ and $x_3$ axes, the two
sums
\[
\alpha_{k1}y_1 + \alpha_{k2}y_2 + \alpha_{k3}y_3 \qquad (k = 2, 3),
\]
expressing the scalar products of the two vectors with the variable
vector $(y_1, y_2, y_3)$, reduce simply to the forms $y_2$ and $y_3$, i.e. with this
choice of coordinate axes we have:
\[
\alpha_{22} = \alpha_{33} = 1; \qquad \alpha_{21} = \alpha_{23} = \alpha_{31} = \alpha_{32} = 0.
\]
With the axes chosen in the two spaces, the matrix of transformation
(24) takes the form
\[
\begin{Vmatrix}
\alpha_{11} & \alpha_{12} & \alpha_{13} & \alpha_{14} \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\alpha_{41} & \alpha_{42} & \alpha_{43} & \alpha_{44}
\end{Vmatrix}. \tag{25}
\]
This matrix has been obtained as a result of multiplying the original
matrix by two orthogonal transformations of only the first three
variables, though they can evidently be looked on as transformations of
the four variables, the fourth variable being kept constant. Since
the product of two orthogonal transformations must also be
orthogonal, we can say that the elements of (25) also satisfy the
orthogonality condition. On writing down this condition for the first
row with respect to the second and third, we get
\[
\alpha_{12} = \alpha_{13} = 0,
\]
and similarly, for the fourth row with respect to the second and third:
\[
\alpha_{42} = \alpha_{43} = 0.
\]
We thus arrive at the following matrix:
\[
\begin{Vmatrix}
\alpha_{11} & 0 & 0 & \alpha_{14} \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\alpha_{41} & 0 & 0 & \alpha_{44}
\end{Vmatrix},
\]
i.e. we have in this case the linear transformation
\[
y_1' = \alpha_{11}y_1 + \alpha_{14}y_4, \qquad y_4' = \alpha_{41}y_1 + \alpha_{44}y_4,
\]
which has to satisfy the condition
\[
y_1'^2 + y_4'^2 = y_1^2 + y_4^2.
\]
We dealt with this transformation above; it led us to the special
Lorentz transformation (15), and (22) can thus be taken as proved.
We just notice that the sign rule is the same in defining transformation
$S$ as previously, if the general Lorentz transformation $T$ is required
not to change the direction for measuring time and also to have a
determinant greater than zero. We can always regard the orthogonal
transformations $U$ and $V$ as rotations of three-dimensional space,
so that their determinant will be greater than zero; at the same time,
they in no way affect the fourth variable. We can thus conclude
that transformation $S$ must also have a determinant greater than
zero, whilst it must not affect the measurement of time, i.e. given our
assumption regarding the general transformation $T$, we arrive at
precisely the conditions for the special transformation under which
our formulae were derived. The general transformation satisfying the
two conditions postulated above is usually termed a positive Lorentz
transformation. It follows from the above discussion that the matrices
corresponding to these are given by (22), where $S$ is the particular
Lorentz transformation of type (15) and $U$ and $V$ are the matrices
of rotations of three-dimensional space. Positive Lorentz
transformations, like transformations (15), can be shown to form a group.


The above arguments show that the matrix of the most general
Lorentz transformation, defined only by condition (21), can be
written in the form (22), where $U$ and $V$ are rotations and $S$ is the general
Lorentz transformation in two variables. If this is a positive
transformation, it follows at once from (15) that $D(S) = 1$, and the
determinant of every positive Lorentz transformation is also equal to
unity, inasmuch as the determinants of $U$ and $V$ are unity, matrices
$U$, $S$, and $V$ being considered as of the fourth order. As may easily
be seen, the determinant can be equal to (±1) in the general case of
a second order Lorentz transformation, so that the general
transformation will likewise have a determinant of (±1).


55. Permutations. We have so far considered examples of groups 
whose elements are linear transformations. There is no essential 
connection between the group concept and linear transformations, 
and we can construct groups for other types of operation. Our next 
discussion concerns permutations, representing a type of operation 
that we have already encountered in [2]. We must first mention some 
basic facts and concepts in regard to permutations. 

Suppose we have 7 objects, which we enumerate as in [2], i.e. we 
can simply suppose that the objects are the integers 1,2,... 2. 
As we know, 7! permutations are possible of these numbers. We 
take one such permutation: 


Py Pz --->Pn- (26) 
This set of p, yields all the numbers from 1 to n, arranged in a definite 


order in accordance with (26). We compare permutation (26) with 
the basic permutation 1, 2, ..., 2: 


i 2, +--+, 7 
Pir Par -++> Ph 

The passage from the basic permutation to (26) is accomplished 
by replacing 1 by p,, 2 by p,, and so on. We denote this operation by 
the single letter P and refer to it in future as a permutation. We 
now define the inverse permutation P-1. This is the operation such 
that (26) becomes the basic permutation, i.e. p, is replaced by 1, 
p, by 2, and so on. We can explain this by means of an example: 
suppose n = 5 and we have the permutation 


1, 2,3, 4,5 
|(). 
3, 2,5,1,4 


| (P). (27) 


202 ‘THE BASIC THEORY OF GROUIS AND LINBAR REPRESENTATIONS OF GROUPS [55 


The inverse permutation will be

$$P^{-1} = \begin{pmatrix} 1, & 2, & 3, & 4, & 5 \\ 4, & 2, & 1, & 5, & 3 \end{pmatrix}.$$

It is easily seen that

$$(P^{-1})^{-1} = P. \tag{28}$$
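As a small check of the example (27) and of relation (28), the following sketch computes an inverse permutation in Python. It assumes a permutation is stored as the tuple of its bottom row (the image of 1 first, then the image of 2, and so on); the function name `inverse` is our own choice for illustration.

```python
def inverse(p):
    """Inverse of a permutation stored as a tuple of images: p[i-1] is the image of i."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i  # the inverse sends the image back to i
    return tuple(inv)


P = (3, 2, 5, 1, 4)            # bottom row of (27)
P_inv = inverse(P)
print(P_inv)                   # bottom row of the inverse permutation
print(inverse(P_inv) == P)     # relation (28): the inverse of the inverse is P
```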


We now introduce the product of permutations. Let $P_1$ and $P_2$ be
any two permutations. Their product $P_2 P_1$ is defined as the result
of carrying out first the permutation $P_1$ and then $P_2$. For instance,
if we have the two permutations


1, 2,8,4,5) 4 (p24) p) 
La a) ae ead ae 


their product $P_2 P_1$ gives the permutation


i 2,3,4,5 


P,P). 
pear 2F) 


Obviously, the inverse permutation $P^{-1}$ is fully defined by the
condition

$$P^{-1} P = P P^{-1} = I, \tag{29}$$


where I denotes the identity permutation, in which each element is 
replaced by itself. 

We can define the product of any number of permutations by applying
them successively. A product evidently satisfies the law of association,
e.g.

$$P_3 (P_2 P_1) = (P_3 P_2) P_1. \tag{30}$$

For we can either first form the product of $P_2$ with $P_1$, then form
the product of this with $P_3$, or else we can replace the successive
application of $P_2$ and $P_3$ by the application of the single permutation
$(P_3 P_2)$ which is equivalent to $P_2$ and $P_3$ applied successively. We finally
notice that the identity permutation clearly satisfies

$$IP = PI = P, \tag{31}$$

where $P$ is any permutation. Products of permutations do not in
general satisfy the commutative law, i.e. the permutations $P_2 P_1$ and
$P_1 P_2$ are in general different. We suggest that this be verified for
the above example.
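The product rule, the associative law (30), relation (31) and the failure of the commutative law can be tried out with a short Python sketch. We assume here that in a product the right-hand factor is carried out first, and that a permutation is stored as the tuple of its bottom row. The three permutations `P1`, `P2`, `P3` are hypothetical examples chosen for illustration; they are not the ones printed in the text.

```python
def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2 (tuples of images, 1-based)."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


identity = (1, 2, 3, 4, 5)
# Hypothetical permutations, chosen only to illustrate the rules.
P1 = (3, 2, 5, 1, 4)
P2 = (2, 1, 3, 5, 4)
P3 = (5, 4, 3, 2, 1)

print(compose(P2, P1))   # the product P2 P1
print(compose(P1, P2))   # the product P1 P2, which differs from P2 P1
print(compose(P3, compose(P2, P1)) == compose(compose(P3, P2), P1))  # law (30)
```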

We have thus established the basic ideas of inverse and identity 
permutations and products exactly as was done previously for linear
transformations (matrices). We can now continue the analogy and
establish the further concept of group. A set of permutations forms a 
group if the following two conditions are fulfilled: firstly, if a per- 
mutation belongs to our set, its inverse also belongs to the set, and 
secondly, the product in any order of two permutations belonging 
to the set also belongs to the set. As in the case of linear transforma- 
tions, the identity permutation necessarily belongs to the set. 

The set of all $n!$ permutations clearly forms a group. We now establish
the existence of another group consisting only of part of the above. 
We observe for this that any permutation can be obtained with the 
aid of transpositions [2], different numbers of these being possible 
for a given permutation though the number is always even or always 
odd. The permutations resulting from an even number of transposi- 
tions themselves form a group. The group formed by all the per- 
mutations is usually termed symmetric, whilst the even permutations, 
i.e. those resulting from an even number of transpositions, form an 
alternating group. 
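The statement that the even permutations form a group containing half of all permutations can be checked numerically for small $n$. The sketch below is only an illustration: it decides parity by counting inversions, as in [2], rather than by counting transpositions, and the names `is_even` and `compose` are our own.

```python
from itertools import permutations


def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def is_even(p):
    """A permutation is even when its bottom row has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0


n = 4
symmetric = list(permutations(range(1, n + 1)))
alternating = [p for p in symmetric if is_even(p)]
alt_set = set(alternating)
# Closure: the product of two even permutations is again even.
closed = all(compose(a, b) in alt_set for a in alternating for b in alternating)
print(len(symmetric), len(alternating), closed)
```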

We now consider permutations of a special type. Let $l_1, l_2, \ldots, l_m$
be any $m$ different integers from the integers 1 up to $n$. Suppose our
permutation consists in replacing $l_1$ by $l_2$, $l_2$ by $l_3$, $\ldots$, $l_{m-1}$ by $l_m$, and
finally, $l_m$ by $l_1$. A permutation of this type is called a cycle and
written $(l_1, l_2, \ldots, l_m)$. Cyclic permutations of the numbers inside the
brackets give us the cycles

$$(l_2, l_3, \ldots, l_m, l_1), \quad (l_3, l_4, \ldots, l_m, l_1, l_2), \quad \ldots,$$

which evidently yield the same permutation as $(l_1, l_2, \ldots, l_m)$. If
$m = 1$, i.e. we have the cycle $(l_1)$, the cycle is clearly equivalent to
the identity permutation, and there is no point in considering it.
A cycle of two numbers $(l_1, l_2)$ is obviously equivalent to transposition
of the elements $l_1$ and $l_2$.

If we have two cycles with no common elements, their product is 
independent of the order of the factors. 

Suppose say n = 5, and we take the product of the two cycles with 
no common elements, 


$$(1, 3)(2, 4, 5) \quad \text{and} \quad (2, 4, 5)(1, 3).$$

Both these products clearly yield the same permutation

$$\begin{pmatrix} 1, & 2, & 3, & 4, & 5 \\ 3, & 4, & 1, & 5, & 2 \end{pmatrix}.$$
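The following Python sketch builds the permutation defined by a product of cycles and confirms that the two cycles of the example commute. It is an illustration under our own conventions: a permutation is the tuple of its bottom row, the right-most cycle is carried out first, and `from_cycles` is a hypothetical helper name.

```python
def from_cycles(n, *cycles):
    """Permutation of 1..n given by a product of cycles, right-most cycle applied first."""
    perm = list(range(1, n + 1))
    for cycle in reversed(cycles):
        step = list(range(1, n + 1))
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            step[a - 1] = b                  # the cycle replaces a by b
        perm = [step[x - 1] for x in perm]   # carry out this cycle after the earlier ones
    return tuple(perm)


print(from_cycles(5, (1, 3), (2, 4, 5)))
print(from_cycles(5, (2, 4, 5), (1, 3)))
# Cycles with a common element need not commute:
print(from_cycles(3, (1, 2), (1, 3)), from_cycles(3, (1, 3), (1, 2)))
```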




We can represent any permutation $P$ as a product of cycles having
no common elements. To do this, we take the element 1 as the first
element in a cycle. We take as the second element that which is
obtained from 1 with the aid of $P$. Let this be $l_2$. We take as the third
element that which is obtained from $l_2$ with the aid of $P$, and so on,
until finally we arrive at the element which becomes 1 with the aid
of $P$. This will be the last element of the first cycle. It may easily
be seen that this cycle cannot contain identical elements. The cycle
thus composed does not in general exhaust all $n$ elements. We take
any one of the remaining elements as the first element of a new cycle
and form this as above, and so on.
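The procedure just described translates almost word for word into code. The sketch below is one possible rendering in Python, assuming a permutation is stored as the tuple of its bottom row; cycles of a single element are omitted, and the name `cycle_decomposition` is our own.

```python
def cycle_decomposition(p):
    """Cycles without common elements whose product is p; one-element cycles are omitted."""
    seen, cycles = set(), []
    for start in range(1, len(p) + 1):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:      # follow start -> p(start) -> ... until we return to start
            seen.add(x)
            cycle.append(x)
            x = p[x - 1]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


print(cycle_decomposition((3, 6, 4, 1, 2, 5)))   # a permutation with n = 6
print(cycle_decomposition((3, 2, 5, 1, 4)))      # the permutation (27)
```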

We take as an example the permutation with $n = 6$:

$$\begin{pmatrix} 1, & 2, & 3, & 4, & 5, & 6 \\ 3, & 6, & 4, & 1, & 2, & 5 \end{pmatrix}.$$

By using the above method, we can write this as the product of
the cycles

$$\begin{pmatrix} 1, & 2, & 3, & 4, & 5, & 6 \\ 3, & 6, & 4, & 1, & 2, & 5 \end{pmatrix} = (1, 3, 4)(2, 6, 5),$$

the order of the factors on the right being of no consequence.

The product of two transpositions is readily expressible as a product
of third degree cycles. If the second degree cycles (transpositions)
have no common elements, it may easily be shown that

$$(l_1, l_2)(l_3, l_4) = (l_1, l_3, l_4)(l_1, l_2, l_4),$$

whereas with common elements:

$$(l_1, l_2)(l_1, l_3) = (l_1, l_3, l_2).$$

Thus every permutation of an alternating group can be written
as a product of third degree cycles.

We also note that the numbers in the first row of a permutation can
be written in any order. The only thing that matters is that each
of these should have below it the number which it becomes as a result
of the given permutation. For instance, the following are forms of
the same permutation:

$$\begin{pmatrix} 1, & 2, & 3, & 4, & 5 \\ 3, & 2, & 5, & 1, & 4 \end{pmatrix} = \begin{pmatrix} 3, & 1, & 5, & 4, & 2 \\ 5, & 3, & 4, & 1, & 2 \end{pmatrix}.$$
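Identities expressing a product of two transpositions through third degree cycles depend on the order in which the factors are carried out. As a hedged check, the sketch below verifies two such identities, $(a, b)(c, d) = (a, c, d)(a, b, d)$ and $(a, b)(a, c) = (a, c, b)$, for all choices of distinct labels from 1 to 5, under the assumption that the right-hand factor is carried out first. The helper `from_cycles` is our own.

```python
from itertools import permutations


def from_cycles(n, *cycles):
    """Permutation of 1..n given by a product of cycles, right-most cycle applied first."""
    perm = list(range(1, n + 1))
    for cycle in reversed(cycles):
        step = list(range(1, n + 1))
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            step[a - 1] = b
        perm = [step[x - 1] for x in perm]
    return tuple(perm)


# Transpositions without common elements.
ok_disjoint = all(
    from_cycles(5, (a, b), (c, d)) == from_cycles(5, (a, c, d), (a, b, d))
    for a, b, c, d in permutations(range(1, 6), 4)
)
# Transpositions with one common element.
ok_common = all(
    from_cycles(5, (a, b), (a, c)) == from_cycles(5, (a, c, b))
    for a, b, c in permutations(range(1, 6), 3)
)
print(ok_disjoint, ok_common)
```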


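Given the permutation

$$P = \begin{pmatrix} a_1, & a_2, & \ldots, & a_n \\ b_1, & b_2, & \ldots, & b_n \end{pmatrix},$$

we can obviously write the inverse permutation as

$$P^{-1} = \begin{pmatrix} b_1, & b_2, & \ldots, & b_n \\ a_1, & a_2, & \ldots, & a_n \end{pmatrix}.$$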


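Suppose we have two permutations, the second being written in
two ways:

$$P = \begin{pmatrix} 1, & 2, & \ldots, & n \\ a_1, & a_2, & \ldots, & a_n \end{pmatrix}, \quad Q = \begin{pmatrix} 1, & 2, & \ldots, & n \\ b_1, & b_2, & \ldots, & b_n \end{pmatrix} = \begin{pmatrix} a_1, & a_2, & \ldots, & a_n \\ c_1, & c_2, & \ldots, & c_n \end{pmatrix}.$$

We have

$$PQ^{-1} = \begin{pmatrix} 1, & 2, & \ldots, & n \\ a_1, & a_2, & \ldots, & a_n \end{pmatrix} \begin{pmatrix} b_1, & b_2, & \ldots, & b_n \\ 1, & 2, & \ldots, & n \end{pmatrix} = \begin{pmatrix} b_1, & b_2, & \ldots, & b_n \\ a_1, & a_2, & \ldots, & a_n \end{pmatrix},$$

and consequently

$$QPQ^{-1} = \begin{pmatrix} a_1, & a_2, & \ldots, & a_n \\ c_1, & c_2, & \ldots, & c_n \end{pmatrix} \begin{pmatrix} b_1, & b_2, & \ldots, & b_n \\ a_1, & a_2, & \ldots, & a_n \end{pmatrix} = \begin{pmatrix} b_1, & b_2, & \ldots, & b_n \\ c_1, & c_2, & \ldots, & c_n \end{pmatrix}.$$

The following rule follows from this equation: to obtain the permutation
$QPQ^{-1}$, carry out the permutation $Q$ in both rows of

$$P = \begin{pmatrix} 1, & 2, & \ldots, & n \\ a_1, & a_2, & \ldots, & a_n \end{pmatrix}.$$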
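The rule for forming $QPQ^{-1}$ can be checked mechanically. In the sketch below (our own illustration, with permutations stored as tuples of images and the right-hand factor of a product carried out first), `relabel` carries out $Q$ in both rows of $P$ and the result is compared with the product computed directly for every pair of permutations of four objects.

```python
from itertools import permutations


def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def relabel(p, q):
    """Carry out q in both rows of the two-row form of p."""
    result = [0] * len(p)
    for i, image in enumerate(p, start=1):
        result[q[i - 1] - 1] = q[image - 1]  # column (i, p(i)) becomes (q(i), q(p(i)))
    return tuple(result)


S4 = list(permutations((1, 2, 3, 4)))
rule_holds = all(
    compose(q, compose(p, inverse(q))) == relabel(p, q) for p in S4 for q in S4
)
print(rule_holds)
```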


56. Abstract groups. When defining a group, we can completely
disregard the concrete interpretation of the operations, the set of 
which forms the group, and which we have previously taken to be 
linear transformations or permutations. We arrive in this way at an 
abstract group. 

An abstract group is a set of symbols for which multiplication is 
defined, in the sense that there is a definite rule by which two elements 
P and Q (the same or different) of the set yield a third element, also 
belonging to the set, which is called their product and written QP. 
The following three conditions must be fulfilled here. 

1. Multiplication must obey the associative law, i.e. (RQ)P = R(QP), 
whence it follows that, in general, any number of factors in a product 
can be grouped together, without, of course, changing their order. 

2. There must be one and only one element E in our set which, when 
multiplied on either side by any other element, yields the same element 
i.e. 


EP =PE=P. (32) 
We shall call E the identity or unit element. 




3. For any element $P$ of the set, there exists a further unique element
$Q$ of the set which satisfies the condition

$$QP = PQ = E \quad (Q = P^{-1}). \tag{33}$$

With $P = E$, (32) gives $EE = E$, i.e. the inverse of $E$ is $E$ by definition
of inverse element ($E^{-1} = E$).

These conditions defining an abstract group can be put in a more 
compact form with more restricted requirements in which case the 
restricted requirements imply the remainder as necessary formal 
consequences; we shall not dwell on this, however. On the whole we 
shall confine ourselves to the elementary basic facts regarding abstract 
groups. A detailed treatment of the theory of groups provides enough 
material to fill a separate volume. Our aim is simply to familiarize the 
reader with basic concepts and facilitate his reading of the literature 
of physics, where the group concept and the fundamental properties 
of groups are frequently utilized. Below, we shall occasionally write $I$
instead of $E$. The element $Q$ defined by equations (33) is called the
inverse of $P$ and is written $P^{-1}$. Equation (28) is clearly valid, since it
follows from (33) that $P$ is the inverse of $Q$.

Having established the concept of abstract group, we explain next 
some fresh concepts and also prove some properties of abstract 
groups. We first of all notice that the number of elements in a group 
can be either finite or infinite, as we saw above. Suppose we take the 
product of elements of a group 

$$RQP.$$

This is also an element of the group. The inverse is obtained exactly
as in the linear transformation group, i.e. it is

$$(RQP)^{-1} = P^{-1} Q^{-1} R^{-1}.$$

This is easily seen by carrying out the multiplication and using
the associative law. Given an element $P$ of the group, its positive
integral powers

$$P^0 = I, \; P^1, \; P^2, \; \ldots$$

are likewise elements of the group. If there exists a positive integer
$m$ such that $P^m = I$, the element is said to be of finite order, the
order of an element being the least positive number $m$ for which
$P^m = I$. There can be no identical elements among

$$E, \; P, \; P^2, \; \ldots, \; P^{m-1}.$$

For it follows at once from the condition $P^k = P^l$ ($k < l$) that
$P^{l-k} = E$ with $0 < l - k < m$, which contradicts the definition of $m$.
All the elements of a finite group are clearly of finite order.
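For permutations, the order of an element and the rule for the inverse of a product can be tried out directly. The sketch below is an illustration under our own conventions (tuples of images, right-hand factor carried out first); the elements `P`, `Q`, `R` are hypothetical examples.

```python
def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def order(p):
    """Least positive m for which the m-th power of p is the identity."""
    identity = tuple(range(1, len(p) + 1))
    power, m = p, 1
    while power != identity:
        power = compose(p, power)
        m += 1
    return m


# Hypothetical group elements used for illustration.
P, Q, R = (3, 2, 5, 1, 4), (2, 1, 3, 5, 4), (5, 4, 3, 2, 1)
lhs = inverse(compose(R, compose(Q, P)))                   # inverse of the product RQP
rhs = compose(inverse(P), compose(inverse(Q), inverse(R)))  # inverses in reverse order
print(order(P), lhs == rhs)
```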




Let us write $P_\alpha$ for elements of the group. If the group is finite, $\alpha$
can be assumed to take a finite number of positive integral values.
If the group is infinite, it can take all integral values [52], it can
vary continuously, or it can even be equivalent to several subscripts
which vary continuously. Let $U$ be a fixed element of the group.
We form all the possible products $UP_\alpha$. We can easily show that, as
the subscript $\alpha$ varies, the product again gives us all the elements
of the group, without repetition.

For on multiplying on the left by $U^{-1}$, the equation

$$UP_{\alpha_1} = UP_{\alpha_2}$$

gives us at once $P_{\alpha_1} = P_{\alpha_2}$, i.e. the products $UP_\alpha$ must be different for
different $\alpha$. To show that the product can become any element of
the group, we take $UP_\alpha = P_{\alpha_0}$, which is equivalent to $P_\alpha = U^{-1} P_{\alpha_0}$,
i.e. $UP_\alpha$ in fact gives us the element $P_{\alpha_0}$ when the factor $P_\alpha$ is equal
to the element $U^{-1} P_{\alpha_0}$ of the group. We should have the same result
if the fixed element $U$ were written on the right instead of the left.
We thus arrive at the following: if $P_\alpha$ varies over all the elements of the
group and $U$ is a fixed element, the product $UP_\alpha$ (or $P_\alpha U$) likewise
varies over all the elements, without repetition.
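This property is easy to observe on a small example. The sketch below takes the six permutations of three objects as the group and a hypothetical fixed element `U`, and checks that multiplication by `U` on either side merely rearranges the list of elements. The conventions (tuples of images, right-hand factor carried out first) are our own.

```python
from itertools import permutations


def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


group = list(permutations((1, 2, 3)))
U = (2, 3, 1)                                    # a fixed element, chosen arbitrarily
left = [compose(U, p) for p in group]            # the products U P
right = [compose(p, U) for p in group]           # the products P U
print(sorted(left) == sorted(group), sorted(right) == sorted(group))
```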

We take as a particular example the group consisting of six elements 
(a sixth order group), the elements being denoted by 


E, A, B, C, D, F. 


We define the multiplication rule with the aid of the following 
table: 


      | E  A  B  C  D  F
    --+------------------
    E | E  A  B  C  D  F
    A | A  E  D  F  B  C
    B | B  F  E  D  C  A        (34)
    C | C  D  F  E  A  B
    D | D  C  A  B  F  E
    F | F  B  C  A  E  D


The table must be used as follows. Suppose we want to find the
product $DB$: we look for $B$ in the first row and $D$ in the first column,
then find $A$ at the intersection of the corresponding column and
row, so that the product $DB$ is $A$. All the conditions laid down in the
definition of abstract group may readily be seen to be satisfied here,
the role of the identity element being played by $E$.
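The three conditions of the definition can be verified for table (34) by brute force. In the sketch below the table is entered row by row as strings; the dictionary `rows` and the function `mul` are our own devices for reading the table in the manner just described.

```python
elements = "EABCDF"
rows = {                      # rows of table (34): rows[x] lists the products x y
    "E": "EABCDF",
    "A": "AEDFBC",
    "B": "BFEDCA",
    "C": "CDFEAB",
    "D": "DCABFE",
    "F": "FBCAED",
}


def mul(x, y):
    """Product xy: the entry in the row of x and the column of y."""
    return rows[x][elements.index(y)]


associative = all(
    mul(mul(r, q), p) == mul(r, mul(q, p))
    for r in elements for q in elements for p in elements
)
has_identity = all(mul("E", p) == p == mul(p, "E") for p in elements)
inverses = {
    p: next(q for q in elements if mul(q, p) == "E" == mul(p, q)) for p in elements
}
print(mul("D", "B"), associative, has_identity, inverses)
```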




We have met with concrete interpretations of the abstract group 
concept in previous examples. In one case the role of element was 
played by linear transformations (their matrices) and multiplication 
of two elements amounted to successive application of two transforma- 
tions, i.e. to multiplication of the corresponding matrices. In another 
case permutations played the part of elements and multiplication of 
two elements amounted to successively carrying out two permutations. 
We shall now mention a further concrete interpretation of the
elements of a group.

Let the elements be all the complex numbers and let multiplication 
of two elements amount to addition of the corresponding complex 
numbers. In this case the role of identity element is played by zero, 
whilst the inverse element to the complex number a is (—a). Instead 
of the complex numbers, we could have taken as elements all the vectors
$x(x_1, x_2, \ldots, x_n)$ of complex $n$-dimensional space $R_n$ and defined the
multiplication of elements as addition of the corresponding vectors.
Here the null vector plays the part of identity element. We can say
alternatively that the group elements are the vectors of $R_n$ whilst
the group operation is vector addition. We notice that in the last
two examples the result of multiplying two elements of the group is
independent of the order of the factors, i.e. as we say, any two elements
of the group commute. A group of this type is called Abelian [cf.
45]. The simplest example of Abelian group is the cyclic group which
consists of the identity element $E$ and powers of an element $P$. If
$m$ is the least positive integer for which $P^m = E$, the cyclic group
has $m$ elements: $E, P, P^2, \ldots, P^{m-1}$. If there is no such positive
integer $m$, the cyclic group is infinite: $E, P, P^2, \ldots$


57. Subgroups. Suppose a set H of only part of the elements of a 
given group G likewise forms a group, the above definition of multiplica- 
tion being preserved. In this case the group H is said to be a sub- 
group of G. The set consisting simply of the identity (unit) element 
of G clearly always forms a subgroup. This is a trivial case that we 
shall overlook when speaking in future of subgroups. 

We write $H_\alpha$ for elements of the subgroup $H$, and let $G_1$ be a given
element, not belonging to $H$, of the total group $G$. As seen above, the
products $G_1 H_\alpha$ give various elements of $G$; these elements do not
belong to $H$, since otherwise we should have for certain values $\alpha_1$
and $\alpha_2$ of the subscript $\alpha$: $G_1 H_{\alpha_1} = H_{\alpha_2}$, whence $G_1 = H_{\alpha_2} H_{\alpha_1}^{-1}$, i.e.
$G_1$ must belong to $H$, which contradicts our assumption. Now let $G_1$
and $G_2$ be two different elements of $G$ not belonging to the subgroup $H$.
We show that the sets of elements $G_1 H_\alpha$ and $G_2 H_\alpha$ either have no
common elements at all or else coincide, i.e. consist of the same
elements. For suppose we have $G_1 H_{\alpha_1} = G_2 H_{\alpha_2}$ for certain values of $\alpha$;
it follows that $G_1 = G_2 H_{\alpha_2} H_{\alpha_1}^{-1} = G_2 H_{\alpha_3}$, i.e. $G_1$ belongs to the set
of elements $G_2 H_\alpha$, and similarly $G_2$ belongs to the set of $G_1 H_\alpha$.
Hence the products $G_1 H_\alpha$ and $G_2 H_\alpha$ define the same set of elements.

We take all the elements $H_\alpha$ of the subgroup $H$. These do not exhaust
the elements of $G$. We take some element $G_1$ not belonging to $H$ and
form all the products $G_1 H_\alpha$ which, as we have seen, all differ from
each other and from the $H_\alpha$.

It can happen that the $H_\alpha$ and $G_1 H_\alpha$ do not exhaust the whole
group. We take an element $G_2$ not belonging to the $H_\alpha$ or $G_1 H_\alpha$
and form all the products $G_2 H_\alpha$. As we have seen, the elements
$G_2 H_\alpha$ all differ from each other and from the $H_\alpha$ and $G_1 H_\alpha$. If the
elements $H_\alpha$, $G_1 H_\alpha$ and $G_2 H_\alpha$ do not exhaust $G$, we take an element
$G_3$ not belonging to any of the above three sets and form the products
$G_3 H_\alpha$. Hence we get further elements of the group, and so on. Suppose
we exhaust the elements of $G$ by means of a finite number of such
operations. Let $(m - 1)$ elements $G_k$ be required for this. All the
elements of $G$ are now represented as follows:

$$H_\alpha, \; G_1 H_\alpha, \; G_2 H_\alpha, \; \ldots, \; G_{m-1} H_\alpha, \tag{35}$$

where the subscript $\alpha$ varies over values corresponding to the
subgroup $H$. If we set $G'_k = G_k H_{\alpha_0}$, where $\alpha_0$ is fixed in some manner,
the set of elements $G'_k H_\alpha$ will coincide, as shown above, with the set
$G_k H_\alpha$. In other words, in every set $G_k H_\alpha$ ($G_0 = I$) any element
of the set can play the role of $G_k$. Hence it follows at once that, for any
given subgroup $H$, the division of the elements of group $G$ into sets
of type (35) is fully defined. The $G_k H_\alpha$ are called cosets with respect
to the subgroup $H$.

In the case of (35), $H$ is said to be a subgroup of finite index, and
is in fact a subgroup of index $m$. If $G$ is a finite group, the index of
the subgroup $H$ is clearly equal to the quotient of the order of $G$ and
the order of $H$, the order of a finite group being defined as the number
of elements contained in it. We notice that only the first of sets (35)
forms a subgroup. The remaining sets $G_k H_\alpha$ do not contain the
identity element and therefore cannot form a subgroup.
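The division (35) of a finite group into cosets can be produced by exactly the procedure described: take an element not yet covered, multiply it into the subgroup, and repeat. The sketch below does this for the group of all permutations of three objects and a subgroup of two elements; the names and the choice of subgroup are ours, and products carry out the right-hand factor first.

```python
from itertools import permutations


def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def left_cosets(group, subgroup):
    """Distinct sets g H, in the order in which they are first met."""
    cosets, covered = [], set()
    for g in group:
        if g in covered:
            continue
        coset = frozenset(compose(g, h) for h in subgroup)
        cosets.append(coset)
        covered |= coset
    return cosets


G = list(permutations((1, 2, 3)))   # symmetric group, order 6
H = [(1, 2, 3), (2, 1, 3)]          # identity and the transposition (1, 2)
cosets = left_cosets(G, H)
index = len(cosets)
print(index, len(G) // len(H))      # the index equals the quotient of the orders
```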

We formed scheme (35) by multiplying the elements $H_\alpha$ of the
subgroup $H$ on the left by elements $G_k$ of group $G$. We could multiply
on the right; changing the notation from $G_k$ to $G'_k$, we arrive at the
following new representation of the elements of $G$:

$$H_\alpha, \; H_\alpha G'_1, \; H_\alpha G'_2, \; \ldots, \; H_\alpha G'_{m-1}, \tag{36}$$

where it will be shown that the index $m$ of the subgroup remains
unchanged. The sets of elements $G_k H_\alpha$ are sometimes called left
cosets whilst the $H_\alpha G'_k$ are right cosets.

We first of all observe that, if $\alpha$ varies over all the values
corresponding to the subgroup $H$, the elements $H_\alpha^{-1}$ give all the elements
of $H$. This follows directly from the fact that the inverse of a given
element of $H$ also belongs to $H$. We now turn to the proof that the indices
for right and left cosets are the same. We take any two different
sets $G_p H_\alpha$ and $G_q H_\alpha$ ($p \neq q$) of (35), where we can suppose for the
first of sets (35) that say $G_0 = E$. We take the inverse elements:

$$(G_p H_\alpha)^{-1} = H_\alpha^{-1} G_p^{-1} \quad \text{and} \quad (G_q H_\alpha)^{-1} = H_\alpha^{-1} G_q^{-1}.$$

Bearing in mind the remarks made above, we can re-write these sets
of elements as $H_\alpha G_p^{-1}$ and $H_\alpha G_q^{-1}$. They are easily seen to have no
common elements. For suppose we had

$$H_{\alpha_1} G_p^{-1} = H_{\alpha_2} G_q^{-1},$$

it would follow that

$$G_q^{-1} G_p = H_{\alpha_2}^{-1} H_{\alpha_1} = H_{\alpha_3} \quad \text{or} \quad G_p = G_q H_{\alpha_3},$$

and $G_p$ would belong to the set $G_q H_\alpha$, which is impossible. Hence
the sets

$$H_\alpha, \; H_\alpha G_1^{-1}, \; H_\alpha G_2^{-1}, \; \ldots, \; H_\alpha G_{m-1}^{-1}$$

are seen to be right cosets, so that we can simply take $G'_k = G_k^{-1}$
in (36).
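The passage from left cosets to right cosets by taking inverses can be observed on a small example. The sketch below, with our own names and a hypothetical choice of subgroup, checks that the inverses of the elements of $gH$ form the set $Hg^{-1}$, that there are equally many left and right cosets, and that a left coset need not coincide with the right coset of the same element.

```python
from itertools import permutations


def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def left_coset(g, subgroup):
    return frozenset(compose(g, h) for h in subgroup)


def right_coset(subgroup, g):
    return frozenset(compose(h, g) for h in subgroup)


G = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 1, 3)]          # identity and the transposition (1, 2)
inverses_match = all(
    frozenset(inverse(x) for x in left_coset(g, H)) == right_coset(H, inverse(g))
    for g in G
)
n_left = len({left_coset(g, H) for g in G})
n_right = len({right_coset(H, g) for g in G})
print(inverses_match, n_left, n_right)
```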

We consider some examples of subgroups. Let G be the set of real 
orthogonal transformations in three variables and H the set of real 
orthogonal transformations in three variables with (+1) determinant. 
Every real orthogonal transformation is either a rotation, i.e. belongs 
to H, or is the product of a rotation and the symmetric reflection 
relative to the origin given by 


$$x' = -x, \quad y' = -y, \quad z' = -z. \quad (S) \tag{37}$$

The present group $G$ can be represented by the scheme

$$H_\alpha, \; SH_\alpha \tag{38}$$

or

$$H_\alpha, \; H_\alpha S, \tag{39}$$

where $H_\alpha$ denotes the set of all elements of the group $H$. Here, $H$
is a subgroup of index 2.

Let G be the symmetric group of permutations of n elements, and 
H the alternating group made up of even permutations. Further, 
let S be any given odd permutation, say the permutation consisting 
of the single cycle (1, 2), i.e. amounting to transposition of the
elements 1 and 2. It is clear that here also we can represent $G$ by scheme
(38) or (39). In both cases, multiplication on the left leads to the same 
result as multiplication on the right. 

Here, the alternating group is a subgroup of index two of the 
symmetric group. 

We also consider the finite group of the regular octahedron that
we discussed above. Let $l$ be the axis passing through a given vertex
$A$ of the octahedron. Let $S_0, S_1, S_2, S_3$ be the rotations about this
axis by angles $0$, $\pi/2$, $\pi$, and $3\pi/2$. These rotations form a subgroup
of the total group of rotations of the octahedron. Let $T_k$ denote
the rotations displacing $A$ to the remaining five vertices ($k = 1, 2,
3, 4, 5$). We can write the complete octahedral group as

$$S_\alpha, \; T_1 S_\alpha, \; T_2 S_\alpha, \; T_3 S_\alpha, \; T_4 S_\alpha, \; T_5 S_\alpha,$$

i.e. $S_\alpha$ is a subgroup of index six.

Let $G_s$, $G_s^{-1}$ be elements of group $G$ ($s = 1, 2, \ldots, k$). We consider
the set of all elements of $G$ expressible as products of $G_s$, $G_s^{-1}$
($s = 1, 2, \ldots, k$).

This set clearly forms a group which is a subgroup of $G$ or else
coincides with $G$.

This subgroup is said to be generated by the given set of elements
$G_s$, $G_s^{-1}$ ($s = 1, 2, \ldots, k$).
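For a finite group of permutations, the subgroup generated by given elements can be found by repeatedly multiplying by the generators and their inverses until nothing new appears. The sketch below is one simple way of doing this; the function `generate` and the sample generators are our own, and products carry out the right-hand factor first.

```python
def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def generate(generators):
    """All products of the generators and their inverses."""
    identity = tuple(range(1, len(generators[0]) + 1))
    gens = set(generators) | {inverse(g) for g in generators}
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in group:       # a new element: keep it and explore from it
                group.add(y)
                frontier.append(y)
    return group


print(len(generate([(2, 3, 1, 4)])))                 # generated by the cycle (1, 2, 3)
print(len(generate([(2, 3, 1, 4), (2, 1, 4, 3)])))   # (1, 2, 3) and (1, 2)(3, 4)
print(len(generate([(2, 1, 3, 4), (2, 3, 4, 1)])))   # (1, 2) and (1, 2, 3, 4)
```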


58. Classes and normal subgroups. Let $U$ and $V$ be elements of a
group. The element $W = VUV^{-1}$ is said to be conjugate to $U$. It is
easily seen that, conversely, $U$ is conjugate to $W$. For $U = V^{-1}WV$.
Two elements $U_1$ and $U_2$, conjugate to a third element $W$:

$$U_1 = V_1 W V_1^{-1}; \quad U_2 = V_2 W V_2^{-1},$$

are also conjugate to each other:

$$U_1 = V_1 V_2^{-1} U_2 (V_1 V_2^{-1})^{-1}.$$


The set of all mutually conjugate elements of a group forms what
is known as a class of the group. A class is fully defined by one of its
elements $U$. For, given $U$, we get the entire class by the expression
$G_\alpha U G_\alpha^{-1}$, where $G_\alpha$ varies over all the elements of the group. We can
thus divide the total group into classes. Bearing in mind the basic
property of the identity element, given in [56], we have

$$G_\alpha I G_\alpha^{-1} = I,$$

i.e. the identity element itself forms a class.

If the element $U$ is of order $m$, i.e. $m$ is the least positive integer
for which $U^m = I$, every conjugate element $G_\alpha U G_\alpha^{-1}$ has the same
order $m$, as follows at once from the equations

$$(G_\alpha U G_\alpha^{-1})^m = G_\alpha U^m G_\alpha^{-1} = I,$$

the same identity with a smaller exponent showing that no lower power
of the conjugate element can be equal to $I$. In other words, all elements
of the same class have the same order.

We remark that when $G_\alpha$ runs over all the elements of group $G$,
the product $G_\alpha U G_\alpha^{-1}$ can give the elements of the class more than
once. For instance, if $U = I$, the product always gives $I$, as we
have seen.
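The division of a finite group into classes, and the fact that the elements of a class share the same order, can be observed by direct computation. The sketch below does this for the symmetric groups on three and four objects; the function names are our own and products carry out the right-hand factor first.

```python
from itertools import permutations


def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def order(p):
    identity = tuple(range(1, len(p) + 1))
    power, m = p, 1
    while power != identity:
        power, m = compose(p, power), m + 1
    return m


def conjugacy_classes(group):
    """Classes of mutually conjugate elements g u g^{-1}."""
    classes, seen = [], set()
    for u in group:
        if u in seen:
            continue
        cls = frozenset(compose(g, compose(u, inverse(g))) for g in group)
        classes.append(cls)
        seen |= cls
    return classes


S3 = list(permutations((1, 2, 3)))
S4 = list(permutations((1, 2, 3, 4)))
sizes3 = sorted(len(c) for c in conjugacy_classes(S3))
sizes4 = sorted(len(c) for c in conjugacy_classes(S4))
same_order = all(len({order(u) for u in c}) == 1 for c in conjugacy_classes(S4))
print(sizes3, sizes4, same_order)
```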

We again take as an example the octahedral rotation group. Let
$U$ be the rotation by $\pi/2$ about an axis $A_1 A_2$ of the octahedron. If
the rotation $T_k$, belonging to the group, transforms this axis to another
axis $l_1$, the vertex $A_1$ being transformed to $A_3$ and $A_2$ to $A_4$, the group
element $T_k U T_k^{-1}$ yields a rotation by $\pi/2$ about the axis $A_3 A_4$. If,
for instance, $T_k$ transforms $A_1$ to $A_2$, the product here yields a rotation
by $\pi/2$ about the axis $A_2 A_1$ or, in other words, a rotation by $3\pi/2$
about $A_1 A_2$. If $T_k$ transforms $A_1 A_2$ into itself, i.e. if the rotation is
about this axis, the product $T_k U T_k^{-1}$ coincides with $U$. Thus in the
present case, the class of elements conjugate to $U$ consists of the set
of rotations by $\pi/2$ about axes of the octahedron.

Similarly, taking the group of rotations of three-dimensional space 
about the origin, we know that every group element U consists of 
a rotation by some angle $\varphi$ about a certain axis. Here the class of
elements conjugate to $U$ is the set of rotations by the angle $\varphi$ about
all the possible axes passing through the origin.

We now discuss another important concept, that of a normal sub- 
group; this is closely connected with the idea of class. Let H be a 
subgroup of a group $G$, and $G_1$ a fixed element of $G$. We consider the
set of group elements given by

$$G_1 H_\alpha G_1^{-1}, \tag{40}$$

where $H_\alpha$ denotes a variable element of the subgroup $H$, in other
words, $H_\alpha$ runs over all the elements of $H$. Products (40) may easily
be seen also to form a subgroup. For if we take say the product of
two elements belonging to the set (40), it also belongs to this set:

$$(G_1 H_{\alpha_1} G_1^{-1})(G_1 H_{\alpha_2} G_1^{-1}) = G_1 H_{\alpha_1} H_{\alpha_2} G_1^{-1} = G_1 H_{\alpha_3} G_1^{-1},$$

and the other conditions for group formation are similarly fulfilled.
Subgroup (40) is said to be conjugate to subgroup $H$; if $G_1$ belongs
to $H$, (40) also consists of elements belonging to $H$, and as is easily
seen, simply coincides with $H$.
Every element $H_{\alpha_0}$ of subgroup $H$ can in this case be obtained
from (40), if we take

$$H_\alpha = G_1^{-1} H_{\alpha_0} G_1.$$

If the element $G_1$ does not belong to subgroup $H$, subgroup (40)
can be different from $H$.

Subgroup $H$ is called a normal subgroup of the total group $G$ if,
for any choice of element $G_1$ of the total group $G$, subgroup (40)
coincides with $H$. We now discuss some new concepts connected with
normal subgroups, examples of which will be given later. 

Let subgroup $H$ be a normal subgroup of the total group $G$. We
shall simplify the writing by assuming this subgroup to have a finite
index $m$. In this case, all elements of group $G$ can be represented by

$$H_\alpha, \; G_1 H_\alpha, \; G_2 H_\alpha, \; \ldots, \; G_{m-1} H_\alpha, \tag{41}$$

where, as usual, $H_\alpha$ is a variable element of $H$. Since $H$ is a normal
subgroup, the set of elements $G_k H_\alpha G_k^{-1}$ coincides with the set of
elements $H_\alpha$, i.e. the set of elements $G_k H_\alpha$ is the same as the set
of elements $H_\alpha G_k$.

Hence, if $H$ is a normal subgroup, the division of the group elements
into cosets in accordance with (41) is the same as the division
into cosets according to the scheme

$$H_\alpha, \; H_\alpha G_1, \; H_\alpha G_2, \; \ldots, \; H_\alpha G_{m-1}. \tag{42}$$

In other words, in this case right cosets are the same as left cosets.

Given any element $H_{\alpha_0}$ of the normal subgroup, the element
$G_1 H_{\alpha_0} G_1^{-1}$, for any choice of $G_1$ from $G$, also belongs to the normal
subgroup, i.e. if an element belongs to the normal subgroup, the entire
class in which the element appears in the basic group also belongs
to the normal subgroup. We can easily prove the converse: a subgroup
is a normal subgroup if, on containing an element, it contains the
entire class to which the element belongs in the basic group.
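Both descriptions of a normal subgroup lend themselves to a direct test on a finite group. In the sketch below, `is_normal` checks that every conjugate of an element of the subgroup lies in the subgroup, while `contains_whole_classes` checks that the subgroup contains the entire class of each of its elements. The names and the two sample subgroups of the symmetric group on three objects are our own choices.

```python
from itertools import permutations


def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def is_normal(group, subgroup):
    """True when g h g^{-1} lies in the subgroup for every g and h."""
    members = set(subgroup)
    return all(
        compose(g, compose(h, inverse(g))) in members for g in group for h in subgroup
    )


def contains_whole_classes(group, subgroup):
    """True when the class of every element of the subgroup lies in the subgroup."""
    members = set(subgroup)
    for h in subgroup:
        cls = {compose(g, compose(h, inverse(g))) for g in group}
        if not cls <= members:
            return False
    return True


S3 = list(permutations((1, 2, 3)))
A3 = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]     # the even permutations
H2 = [(1, 2, 3), (2, 1, 3)]                # identity and the transposition (1, 2)
print(is_normal(S3, A3), is_normal(S3, H2))
```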

We now return to the cosets of (41) or (42), where the elements
$H_\alpha$ make up the normal subgroup $H$. We consider the products
$G_p H_\alpha G_q H_\beta$ of elements of a coset $G_p H_\alpha$ with elements of the coset
$G_q H_\beta$. We can write the set of these products as

$$G_p (H_\alpha G_q) H_\beta,$$

or, since the set of elements $H_\alpha G_q$ is the same as the set of elements $G_q H_{\alpha'}$, as

$$G_p G_q H_{\alpha'} H_\beta.$$

The elements $H_{\alpha'}$ and $H_\beta$ belong to the normal subgroup $H$, and the
same can be said of their product. Hence we can write the above
products as

$$G_p G_q H_\gamma.$$

All elements of this type are included in the same coset of (41),
viz., the set to which the element $G_p G_q$ belongs. It is also easily
shown that in this way we get all the elements of this coset. In short,
if a subgroup is a normal subgroup, multiplication of one coset
by another likewise gives a coset. We shall look on the cosets
as new elements, the first set in scheme (41) or (42) being taken as the
identity element. The above result regarding the multiplication of
cosets gives us a multiplication rule for these new elements that we
have introduced. We propose to the reader the simple proof that this
multiplication rule satisfies all the requirements for group formation,
i.e. given this rule, our new elements themselves form a group, in
which the first coset of the scheme plays the part of identity element.
This new group, the order of which is equal to the index of the normal
subgroup $H$, is said to be complementary to $H$ or is called the factor
group relative to $H$.
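The multiplication of cosets can be tried out on the symmetric group on three objects. The sketch below (our own illustration) forms all products of elements of one coset with elements of another and checks that, for the normal subgroup of even permutations, the result is again a coset, whereas for a subgroup that is not normal this fails.

```python
from itertools import permutations


def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def cosets_of(group, subgroup):
    """Distinct left cosets g H as frozensets."""
    found = []
    for g in group:
        coset = frozenset(compose(g, h) for h in subgroup)
        if coset not in found:
            found.append(coset)
    return found


def set_product(X, Y):
    """All products x y with x taken from X and y from Y."""
    return frozenset(compose(x, y) for x in X for y in Y)


G = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]          # normal subgroup of even permutations
cosets = cosets_of(G, H)
closed = all(set_product(X, Y) in cosets for X in cosets for Y in cosets)

H2 = [(1, 2, 3), (2, 1, 3)]                    # not a normal subgroup
cosets2 = cosets_of(G, H2)
fails = any(set_product(X, Y) not in cosets2 for X in cosets2 for Y in cosets2)
print(len(cosets), closed, fails)
```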

Every group $G$ has two trivial normal subgroups: the identity
element by itself, and the entire group G. 

We shall in future assume a non-trivial case when speaking of 
normal subgroups. It may happen that a group has no normal sub- 
group. 

The group is said to be simple in this case. 


59. Examples. 1. We take the group $G$ of real orthogonal transformations
in three-dimensional space. Let $H$ be the subgroup of rotations, i.e. the set of
orthogonal transformations with (+1) determinant. Further, let $S$ be the
symmetrical reflection about the origin defined by (37). If $H_\alpha$ is a variable
element of $H$, the total group $G$ can be represented as

$$H_\alpha, \; SH_\alpha \quad \text{or} \quad H_\alpha, \; H_\alpha S. \tag{43}$$

If $G_1$ is any transformation of $G$, $G_1 H_\alpha G_1^{-1}$ has a (+1) determinant, i.e.
belongs to $H$, and $H$ is a normal subgroup of index two. We consider the
complementary group to $H$. The identity element $E$ of this group corresponds to the
first of sets (43). The product of two elements of the second set, i.e. of two
orthogonal transformations with (−1) determinant, yields an orthogonal
transformation with (+1) determinant which belongs to the first set. If $K$ is the
element corresponding to the second set, it follows from what has been said
that $K^2 = E$. Thus the complementary group to $H$ consists of the two elements
$E$ and $K$, and $K^2 = E$, i.e. it is a cyclic group of order two. This is true in general
for normal subgroups of index two.

2. For the symmetric group of permutations, the alternating group is a normal 
subgroup of index two. 

We write down the elements of the symmetric group with three elements
and denote each by a single letter, using the notation of [55]:

$$E; \quad A = (2, 3); \quad B = (1, 2); \quad C = (1, 3); \quad D = (1, 3, 2); \quad F = (1, 2, 3).$$

The alternating group consisting of the permutations $E$, $D$, $F$ is a third order cyclic
group ($F = D^2$ and $D = F^2$), where $D^3 = F^3 = E$. The total symmetric group
consists of three classes: I. $E$; II. $A$, $B$, and $C$; III. $D$ and $F$.

The alternating group also consists of three classes: I. $E$; II. $D$; III. $F$. It is
easily verified that the multiplication rule for elements of the symmetric group
in question is that defined by table (34) of [56].
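The agreement with table (34) can be confirmed by computing all thirty-six products of the permutations $E$, $A = (2, 3)$, $B = (1, 2)$, $C = (1, 3)$, $D = (1, 3, 2)$, $F = (1, 2, 3)$. The sketch below assumes that in a product the right-hand factor is carried out first and stores each permutation as the tuple of its bottom row; the dictionaries are our own way of recording the lettering and the table.

```python
def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


# Bottom rows of the six permutations of three objects, with their letters.
perms = {
    "E": (1, 2, 3),
    "A": (1, 3, 2),   # the transposition (2, 3)
    "B": (2, 1, 3),   # the transposition (1, 2)
    "C": (3, 2, 1),   # the transposition (1, 3)
    "D": (3, 1, 2),   # the cycle (1, 3, 2)
    "F": (2, 3, 1),   # the cycle (1, 2, 3)
}
names = {p: name for name, p in perms.items()}

elements = "EABCDF"
rows = {"E": "EABCDF", "A": "AEDFBC", "B": "BFEDCA",
        "C": "CDFEAB", "D": "DCABFE", "F": "FBCAED"}   # table (34)

matches = all(
    names[compose(perms[x], perms[y])] == rows[x][elements.index(y)]
    for x in elements for y in elements
)
print(matches)
```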

The alternating group with n = 4 contains twelve elements which are distrib- 
uted in four classes; 


I. $E$; II. $A_1 = (1, 2)(3, 4)$; $A_2 = (1, 3)(2, 4)$; $A_3 = (1, 4)(2, 3)$;
III. $B_1 = (1, 2, 3)$; $B_2 = (2, 1, 4)$; $B_3 = (3, 4, 1)$; $B_4 = (4, 3, 2)$;
IV. $C_1 = (1, 2, 4)$; $C_2 = (2, 1, 3)$; $C_3 = (3, 4, 2)$; $C_4 = (4, 3, 1)$.


The second class contains three second order elements, whilst the third and 
fourth each have four third order elements. The product of two elements of 
the second class may easily be seen to yield an element again of the second class, 
and since all second order elements fall into the second class, we can say that 
these three elements form, together with the identity element, a normal sub- 
group of the alternating group. Its order is four, whilst the index is three. It 
is easily verified that elements $B_i$ of the third class fall into one of the cosets
of elements with respect to this normal subgroup, whilst the elements $C_i$ are
in the other coset. It may further readily be seen that the product of two third
class elements yields a fourth class element, whilst the product of two fourth
class elements yields a third class element. The identity element $E$ in the
complementary group corresponds to the above normal subgroup. Let $A$ and
$B$ be the two other elements of the complementary group. It follows immediately
from what was said above that $A^2 = B$ and $B^2 = A$, and it is at once clear
that the complementary group consisting of the elements $E$, $A$ and $A^2$, where
$A^3 = E$, is a cyclic group of the third order.

We observe that the elements $E$, (1, 2, 3) and (2, 1, 3) of the basic alternating
group form a cyclic subgroup of the third order, though this subgroup is not
a normal subgroup.
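The statements made about the alternating group with $n = 4$ can be checked by enumeration. The sketch below is an illustration with our own names: it forms the classes of $(1, 2)(3, 4)$, $(1, 2, 3)$ and $(1, 2, 4)$ inside the alternating group, builds the subgroup of order four, and tests the coset and product properties. Parity is decided by counting inversions and products carry out the right-hand factor first.

```python
from itertools import permutations


def compose(p2, p1):
    """Product p2 p1: carry out p1 first, then p2."""
    return tuple(p2[p1[i] - 1] for i in range(len(p1)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def is_even(p):
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j]) % 2 == 0


def class_of(u, group):
    return frozenset(compose(g, compose(u, inverse(g))) for g in group)


A4 = [p for p in permutations((1, 2, 3, 4)) if is_even(p)]
E = (1, 2, 3, 4)
second = class_of((2, 1, 4, 3), A4)   # class of (1, 2)(3, 4)
third = class_of((2, 3, 1, 4), A4)    # class of (1, 2, 3)
fourth = class_of((2, 4, 3, 1), A4)   # class of (1, 2, 4)
V = second | {E}                      # the subgroup of order four

V_is_normal = all(compose(g, compose(v, inverse(g))) in V for g in A4 for v in V)
third_is_coset = third == frozenset(compose((2, 3, 1, 4), v) for v in V)
third_squared_in_fourth = all(compose(x, y) in fourth for x in third for y in third)
print(len(A4), len(second), len(third), len(fourth))
print(V_is_normal, third_is_coset, third_squared_in_fourth)
```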

On enumerating the vertices of a tetrahedron in any order, it is easily verified 
directly that the above alternating group with n = 4 corresponds to the rotations 
that displace the tetrahedron into itself. Every permutation defines a passage 
from one vertex to another. Rotations of $2\pi/3$ about an axis of the tetrahedron
correspond to permutations of the third class, whilst opposite rotations by the 
same angle about the same axes correspond to permutations of the fourth 
class. For instance, rotations about axes passing through vertex number four 
correspond to the permutations (1, 2, 3) and (2, 1, 3). Rotations such that no 
vertex remains unchanged correspond to the permutations of the second class. 

The alternating group can be shown to be simple with n > 4. 

3. If $H$ is any subgroup of an Abelian group $G$, we have $G_1 H_\alpha = H_\alpha G_1$
for any choice of elements $H_\alpha$ of $H$ and $G_1$ of $G$, i.e. $G_1 H_\alpha G_1^{-1} = H_\alpha$, whence
it is at once clear that $H$ is a normal subgroup, i.e. every subgroup of an Abelian
group is a normal subgroup. We take as an example the group $G$ of vector
addition in $R_n$, which was mentioned in [49].

We take for the subgroup $H$ the vectors belonging to some subspace $L_k$
of $R_n$ ($0 < k < n$). The cosets are obtained by adding to any vector
$x$ of $R_n$ all the vectors of the subspace $L_k$.

If $x$ belongs to $L_k$, the coset is the same as the subgroup $H$. We introduce
fundamental vectors $x^{(1)}, x^{(2)}, \ldots, x^{(k)}$ into $L_k$ and fundamental vectors
$x^{(k+1)}, \ldots, x^{(n)}$ into the complementary subspace $M_{n-k}$. By what has been
said above, every coset of elements consists of the vectors:

$$c_1 x^{(1)} + c_2 x^{(2)} + \ldots + c_k x^{(k)} + c_{k+1} x^{(k+1)} + \ldots + c_n x^{(n)},$$

where $c_{k+1}, \ldots, c_n$ have fixed values and $c_1, c_2, \ldots, c_k$ are arbitrary.

We can thus associate with every coset a definite vector of $M_{n-k}$, and
conversely, a definite coset corresponds to every vector of $M_{n-k}$. Corresponding
to addition of two vectors of any two cosets we have the addition of the
corresponding vectors of $M_{n-k}$. In other words, in the case of the above group
operation (vector addition), we can regard the vectors of $M_{n-k}$ as elements
of the complementary group.

In this example, the order of the normal subgroup H and its index are in- 
finite. 


60. Isomorphic and homomorphic groups. Two groups A and B are 
said to be isomorphic if a one-to-one correspondence can be established 
between their elements, i.e. to any element of A there corresponds 
one element of B, and vice versa, the correspondence being such 
that the product of any two elements of A corresponds to the product 
of the corresponding elements of B. If A and B are isomorphic abstract 
groups, they have precisely the same structure, i.e. they have no 
essential differences. 




We now turn to a new concept that represents a generalization of
group isomorphism. Group B is said to be homomorphic to group A 
if to every element of A there corresponds a definite element of B, 
and to every element of B at least one element of A, the correspond- 
ence being such that to the product of two elements of A there cor- 
responds the product of the corresponding elements of B. The present 
case differs from that of isomorphism in that the correspondence 
need not be one-to-one both ways, i.e. the same element of B 
can correspond to several different elements of A. If B is homo- 
morphic to A, and one definite element of A corresponds to each 
element of B, the groups are also isomorphic. We observe moreover 
that if the elements $B_1$ and $B_2$ of B correspond to $A_1$ and $A_2$ of A, the
element $B_1 B_2$ corresponds by definition to the element $A_1 A_2$ of A.

Let $A_0$ be the identity (unit) element of A and $B_0$ the corresponding
element of B. We can easily show that $B_0$ is the identity element of
B. For, we have for any $A_1$ of A:

$$A_0 A_1 = A_1 A_0 = A_1,$$

which leads to the equation for the corresponding elements of B:

$$B_0 B_1 = B_1 B_0 = B_1,$$

where $B_1$ can be assumed to be any element of B by the definition of
homomorphism. The last equation shows that $B_0$ is the identity element
of B. Thus the identity element of B corresponds to the identity element
of A in isomorphic and homomorphic groups. We now take
$A_1$ and its inverse $A_1^{-1}$ of A, and let $B_1$ and $B_2$ be the corresponding
elements of B. The equation $A_1 A_1^{-1} = A_1^{-1} A_1 = A_0$, where $A_0$ is the
identity element of A, gives by definition of homomorphic groups,
$B_1 B_2 = B_2 B_1 = B_0$, where $B_0$ is the identity element by the above;
thus $B_2 = B_1^{-1}$, i.e. inverse elements of B correspond to inverse
elements of A.

Suppose that our groups are homomorphic, but not isomorphic.
We consider the set of elements $C_\alpha$ in group A to which the identity
element $B_0$ of B corresponds. If $B_0$ corresponds to $C_\alpha$, then $B_0^{-1} = B_0$
corresponds to $C_\alpha^{-1}$ by the above, and we have $B_0 B_0 = B_0$ corresponding
to every product $C_{\alpha_1} C_{\alpha_2}$, i.e. the set of elements $C_\alpha$ of A to which
the identity element of B corresponds forms a subgroup C of A.

We show that this subgroup is a normal subgroup. Let $A_1$ be any
given element of A and $B_1$ the corresponding element of B. To every
element of the form $A_1 C_\alpha A_1^{-1}$ there corresponds the element $B_1 B_0 B_1^{-1}$
of B, this latter being $B_0$ in view of the basic properties of identity
elements; thus every element of the form $A_1 C_\alpha A_1^{-1}$ is one of the elements
$C_\alpha$, i.e. it belongs to the subgroup C, or in other words, C is a normal
subgroup. We now consider the decomposition of A into cosets as
follows:

$$C_\alpha,\ A_1 C_\alpha,\ A_2 C_\alpha,\ \ldots \tag{44}$$

Let $B_k$ be the element corresponding to $A_k$. We take two elements
$A_k C_{\alpha_1}$ and $A_k C_{\alpha_2}$ belonging to the same coset. The corresponding
elements to these are $B_k B_0$ and $B_k B_0$, i.e. the same element
$B_k$ of B.

Elements $B_1$ and $B_2$ correspond to different cosets $A_1 C_\alpha$ and
$A_2 C_\alpha$. We show that $B_1$ and $B_2$ are different. If they were the same,
the identity element $B_0$ of B would correspond to the element $A_1^{-1} A_2$,
i.e. this latter element would be one of the $C_\alpha$ and we should have
$A_1^{-1} A_2 = C_{\alpha_0}$, i.e. $A_2 = A_1 C_{\alpha_0}$, which contradicts scheme (44). Hence,
if group B is homomorphic to group A, the set of elements C of A
corresponding to the identity element of B form a normal subgroup, and
each coset to this normal divisor is a set of all the elements of A
corresponding to a given element of B. It follows directly from the
definition of homomorphic groups moreover, that we have, corresponding
to the product of any two elements of different (or the same)
cosets, the product of the elements of B corresponding to these
sets, i.e. more briefly, a definite element of B corresponds to each
coset of A, different elements of B correspond to different cosets,
and the correspondence establishes an isomorphism between group
B and the group in A complementary to C.

We again take as an example the group of real orthogonal trans- 
formations in three-dimensional space; we associate with each trans- 
formation a number equal to the corresponding determinant, and 
define multiplication in the domain of these numbers in the usual 
way, i.e. as numerical multiplication. Our group is now homomorphic 
to the group consisting of the two elements (+1) and (—1), multiplica- 
tion being defined in the usual numerical way for these two elements. 
The role of identity element is played by (+1). The normal subgroup 
in this example consists of the group of rotations. 

If group B is homomorphic but not isomorphic to group A, the set 
of elements of A to which the identity element of B corresponds is 
generally known as the kernel of the homomorphism. We have seen that 
the kernel of the homomorphism is a normal subgroup of group A.
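
The chain of statements in this section (the kernel is a normal subgroup, and each coset of the kernel is the full set of elements mapped to one element of B) can be illustrated on a small finite analogue of the determinant example. The sketch below assumes the symmetric group on three symbols in place of the orthogonal group, with the parity (+1 or -1) of a permutation playing the role of the determinant; the function names are illustrative only.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def sign(p):
    """Parity of a permutation: +1 for even, -1 for odd."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1


S3 = list(permutations(range(3)))

# Products correspond to products, so the map p -> sign(p) is a homomorphism.
hom_ok = all(sign(compose(p, q)) == sign(p) * sign(q) for p in S3 for q in S3)

# Kernel: everything mapped to the identity element (+1).
kernel = {p for p in S3 if sign(p) == 1}
kernel_is_normal = all(
    compose(compose(g, k), inverse(g)) in kernel for g in S3 for k in kernel
)

# Each coset g*kernel is sent to a single element of {+1, -1}.
cosets = {frozenset(compose(g, k) for k in kernel) for g in S3}
images = {frozenset(sign(p) for p in coset) for coset in cosets}

print(hom_ok, kernel_is_normal, len(cosets), images)
```

In this toy case there are two cosets, and they are sent to the two distinct elements +1 and -1, so the complementary group is isomorphic to the group of two elements, as in the example of the text.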




61. Examples. 1. We take the group G of real orthogonal transformations in
three-dimensional space and associate with each the number equal to the
determinant of the transformation, the group operation for these numbers being
defined as ordinary multiplication. The group G’, consisting of (+1) and (-1) with
ordinary multiplication of these numbers, is now homomorphic to group G.
The identity element (+1) of G’ corresponds to rotations of three-dimensional
space of G. These rotations form a normal divisor, whilst the complementary
group is the cyclic group of order two [59].

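2. We take an equilateral triangle in the xy plane with vertices


(1, 0); (cos 120°, sin 120°); (cos 240°, sin 240°)


and form the group G consisting of rotations of the plane about the origin
by the angles 0°, 120°, and 240° which displace the triangle into itself, and of
reflection in the x axis followed by rotation by the angles 0°, 120°, 240°. This
is the dihedral group with n = 3.

We write down all the matrices corresponding to the group elements:

$$
E = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}; \quad
A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \quad
B = \begin{pmatrix} -\tfrac12 & \tfrac{\sqrt3}{2} \\ \tfrac{\sqrt3}{2} & \tfrac12 \end{pmatrix};
$$

$$
C = \begin{pmatrix} -\tfrac12 & -\tfrac{\sqrt3}{2} \\ -\tfrac{\sqrt3}{2} & \tfrac12 \end{pmatrix}; \quad
D = \begin{pmatrix} -\tfrac12 & \tfrac{\sqrt3}{2} \\ -\tfrac{\sqrt3}{2} & -\tfrac12 \end{pmatrix}; \quad
F = \begin{pmatrix} -\tfrac12 & -\tfrac{\sqrt3}{2} \\ \tfrac{\sqrt3}{2} & -\tfrac12 \end{pmatrix}.
$$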


If we return to the multiplication scheme given by table (34) of [56], we 
see that this scheme in fact corresponds to multiplication of the matrices form- 
ing our group. We saw above [59] that this scheme of multiplication also cor- 
responds to the symmetric group of permutations of three elements: 


E; A = (2, 3); B = (1, 2); C = (1, 3); D = (1, 3, 2); F = (1, 2, 3). (45)


Thus if we take as corresponding elements the elements of these two groups
denoted by the same letter, the two groups are isomorphic. The permutations
of group (45) correspond to permutations of the vertices of the equilateral
triangle, if these are numbered correspondingly.

Following the same lines as our discussion of [59], the tetrahedral group is 
isomorphic to the alternating group with n = 4. 
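
The correspondence between the six matrices of the triangle group and the permutations (45) can be checked numerically. The sketch below makes two assumptions that are not spelled out in the text: the matrix entries are taken from the geometric description (reflection in the x axis, rotations by 120° and 240°), and the vertices are numbered 1, 2, 3 in the order in which they are listed, stored here as indices 0, 1, 2.

```python
import math

S = math.sqrt(3) / 2
MATRICES = {
    "E": ((1.0, 0.0), (0.0, 1.0)),
    "A": ((1.0, 0.0), (0.0, -1.0)),   # reflection in the x axis
    "B": ((-0.5, S), (S, 0.5)),       # reflection, then rotation by 120 degrees
    "C": ((-0.5, -S), (-S, 0.5)),     # reflection, then rotation by 240 degrees
    "D": ((-0.5, S), (-S, -0.5)),     # rotation by 240 degrees
    "F": ((-0.5, -S), (S, -0.5)),     # rotation by 120 degrees
}
VERTICES = [(1.0, 0.0), (-0.5, S), (-0.5, -S)]   # vertices 1, 2, 3


def apply(m, v):
    """Apply the 2x2 matrix m to the point v."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])


def matmul(m, n):
    """Product m*n of two 2x2 matrices (n acts first)."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def vertex_permutation(m):
    """Tuple p with p[i] = index of the vertex onto which vertex i is moved."""
    images = []
    for v in VERTICES:
        w = apply(m, v)
        images.append(next(j for j, u in enumerate(VERTICES) if math.dist(u, w) < 1e-9))
    return tuple(images)


def compose(p, q):
    """Permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))


PERMS = {name: vertex_permutation(m) for name, m in MATRICES.items()}

# The product of two matrices corresponds to the product of the permutations.
hom_ok = all(
    vertex_permutation(matmul(m, n)) == compose(PERMS[x], PERMS[y])
    for x, m in MATRICES.items()
    for y, n in MATRICES.items()
)

print(PERMS, hom_ok)
```

With this numbering, A exchanges vertices 2 and 3, B exchanges 1 and 2, C exchanges 1 and 3, and F carries 1 to 2, 2 to 3 and 3 to 1, which matches the list (45); the six permutations are all different, so the correspondence is one-to-one.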

3. A general method can be shown for constructing groups of permutations
homomorphic to a given group G. Let H be any subgroup of finite index n
of group G. We write down the cosets of this:

$$H,\ HS_1,\ HS_2,\ \ldots,\ HS_{n-1}.$$

If we multiply each set on the right by some element S of G, the result is
merely a permutation of the order of the sets, and we shall take it that this
permutation in fact corresponds to the chosen element S of G. It may easily be
shown that we obtain in this way a group G’ of permutations homomorphic
to group G.




The necessary and sufficient condition for the identity element of G’ to
correspond to the element S of G is that, on multiplying on the right by S,
every coset becomes itself, i.e.

$$H_1 S = H_2 \quad \text{and} \quad H_1 S_k S = H_2 S_k \quad (k = 1, 2, \ldots, n-1),$$

where $H_1$ is any element of H and $H_2$ also belongs to H. These equations can
be re-written as

$$S = H_1^{-1} H_2, \qquad S = (S_k^{-1} H_1 S_k)^{-1} (S_k^{-1} H_2 S_k),$$

and it now follows that the necessary and sufficient condition for the identity
element of G’ to correspond to the element S is that S belongs simultaneously
to H and to every conjugate subgroup $S_k^{-1} H S_k$.

If H is a normal subgroup of G, the above requirement amounts to the fact
that S belongs to H, and group G’ is in this case isomorphic to the complementary
group. If H is the identity element alone, G is isomorphic to the group of
permutations G’ which is obtained if the elements of G:

$$E,\ S_1,\ S_2,\ \ldots,\ S_{n-1}$$

are multiplied on the right by any given element S of G, which reduces to some
permutation of the elements of G. We shall discuss in more detail later the
construction of groups of linear transformations isomorphic to a given group.
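
The construction of example 3 can be tried out on a small case. The sketch below is an illustration under stated assumptions: G is taken to be the symmetric group on three symbols, H a subgroup of order two (so that the index is 3), and products are read from left to right so that right multiplication of cosets matches the text. The names `mul`, `induced` and `core` are hypothetical helpers.

```python
from itertools import permutations


def mul(p, q):
    """Group product read left to right: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


G = list(permutations(range(3)))
E = (0, 1, 2)
H = [E, (1, 0, 2)]   # subgroup of order two, index three

# Right cosets H, H S_1, H S_2.
cosets = []
for s in G:
    coset = frozenset(mul(h, s) for h in H)
    if coset not in cosets:
        cosets.append(coset)


def induced(s):
    """Permutation of the coset indices produced by right multiplication by s."""
    return tuple(cosets.index(frozenset(mul(x, s) for x in c)) for c in cosets)


# Products correspond to products (same left-to-right convention on both sides).
hom_ok = all(induced(mul(s, t)) == mul(induced(s), induced(t)) for s in G for t in G)

# Elements corresponding to the identity permutation of the cosets.
kernel = [s for s in G if induced(s) == tuple(range(len(cosets)))]

# Elements lying in H and in every conjugate subgroup g^{-1} H g.
core = [s for s in G if all(mul(mul(g, s), inverse(g)) in H for g in G)]

print(len(cosets), hom_ok, kernel, core)
```

Here the kernel coincides with the intersection of H with all its conjugate subgroups, as the text asserts; since this H is not a normal subgroup, the intersection reduces to the identity element and the group of coset permutations is isomorphic to G itself.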


62. Stereographic projections. Having now concluded the fundamen- 
tal general theory of groups, we consider a particular example of 
group correspondence that has an important role in physics. We 
start with a preliminary description of stereographic projections, that 
give a definite correspondence rule between 
points of a sphere and of a plane. 

We take xyz axes in three-dimensional space and a sphere C of unit radius
with centre at the origin. Let S be the point of the sphere with coordinates
(0, 0, -1) and M a variable point on the sphere (Fig. 3). The straight line
SM intersects the xy plane in a point P, so that we have a fully defined
correspondence rule between points of the sphere C and points of the xy plane;
the point corresponding to S (0, 0, -1) on the sphere is the point at infinity
on the plane. This point correspondence in fact gives the stereographic
projection of the sphere on to the plane.

Fig. 3

We now deduce the expressions for the stereographic projection.
Let NM be the perpendicular from M to the z axis. We have from
similar triangles, bearing in mind that SO = 1:

$$NM = (1 + ON)\,OP. \tag{46}$$

Writing the coordinates of M as (x, y, z) and of P as $(\alpha, \beta)$, we have

$$NM = (1 + z)\,OP,$$

or, on projecting the parallel lines OP and NM on to the x and y
axes:

$$x = (1 + z)\,\alpha; \quad y = (1 + z)\,\beta. \tag{47}$$

The equation $x^2 + y^2 + z^2 = 1$ gives us the quadratic equation
for z:

$$(\alpha^2 + \beta^2)(1 + z)^2 + z^2 = 1,$$

and we obtain on solving this:

$$z = \frac{\pm 1 - (\alpha^2 + \beta^2)}{1 + (\alpha^2 + \beta^2)}.$$

But we must have z > -1 for all points $(\alpha, \beta)$ at a finite distance,
and consequently we must have (+1) in the above expression. On
making use also of expressions (47), we get finally for (x, y, z) in
terms of $(\alpha, \beta)$:

$$x = \frac{2\alpha}{1 + \alpha^2 + \beta^2}; \quad y = \frac{2\beta}{1 + \alpha^2 + \beta^2}; \quad z = \frac{1 - \alpha^2 - \beta^2}{1 + \alpha^2 + \beta^2}. \tag{48}$$

We introduce instead of the two real coordinates $\alpha$, $\beta$ on the plane
the complex coordinate $\zeta = \alpha + i\beta$. On writing as usual $\bar\zeta$ for the
complex conjugate of $\zeta$, we can re-write the above expressions as

$$x + iy = \frac{2\zeta}{1 + \zeta\bar\zeta}; \quad x - iy = \frac{2\bar\zeta}{1 + \zeta\bar\zeta}; \quad z = \frac{1 - \zeta\bar\zeta}{1 + \zeta\bar\zeta}. \tag{49}$$

We write $\zeta$ as the ratio of two other complex numbers $\xi$ and $\eta$:

$$\zeta = \frac{\eta}{\xi}. \tag{50}$$

Pairs of values of $\xi$ and $\eta$ differing by a common factor, i.e. pairs
of the form $k\xi$, $k\eta$ and $\xi$, $\eta$, give the same $\zeta$, i.e. the same point of
the plane, whilst the pair $\eta \neq 0$, $\xi = 0$ gives the point at infinity.
The complex numbers $\xi$ and $\eta$ are called homogeneous complex
coordinates on the plane. Using (50) and separating into real and
imaginary parts, we can re-write (49) as

$$x = \frac{\bar\xi\eta + \xi\bar\eta}{\xi\bar\xi + \eta\bar\eta}; \quad y = \frac{1}{i}\,\frac{\bar\xi\eta - \xi\bar\eta}{\xi\bar\xi + \eta\bar\eta}; \quad z = \frac{\xi\bar\xi - \eta\bar\eta}{\xi\bar\xi + \eta\bar\eta}. \tag{51}$$

These last formulae give us, for any complex $\xi$ and $\eta$, real (x, y, z)
satisfying the relationship

$$x^2 + y^2 + z^2 - 1 = 0, \tag{51$_1$}$$

as is to be expected, since the point (x, y, z) lies on the unit sphere.
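
The projection formulae lend themselves to a quick numerical check. The sketch below assumes the reading of (48) and (51) in which the pole of projection is S = (0, 0, -1) and $\zeta = \eta/\xi$; the function names are illustrative and do not come from the text.

```python
import math


def plane_to_sphere(alpha, beta):
    """Point (x, y, z) of the unit sphere projected from S = (0, 0, -1) onto (alpha, beta)."""
    r2 = alpha * alpha + beta * beta
    d = 1.0 + r2
    return (2 * alpha / d, 2 * beta / d, (1 - r2) / d)


def sphere_to_plane(x, y, z):
    """Inverse map, from x = (1 + z) alpha and y = (1 + z) beta."""
    if math.isclose(z, -1.0):
        raise ValueError("the pole S corresponds to the point at infinity")
    return (x / (1 + z), y / (1 + z))


def homogeneous_to_sphere(xi, eta):
    """Sphere point from homogeneous coordinates, with zeta = eta / xi."""
    d = abs(xi) ** 2 + abs(eta) ** 2
    p = xi.conjugate() * eta          # equals (x + i y) * d / 2
    return (2 * p.real / d, 2 * p.imag / d, (abs(xi) ** 2 - abs(eta) ** 2) / d)


point = plane_to_sphere(0.3, -1.2)
print(point, sum(c * c for c in point))          # the squares should sum to 1
print(sphere_to_plane(*point))                   # should give back (0.3, -1.2)

# Any pair (xi, eta) with eta / xi = 0.3 - 1.2i should give the same sphere point.
xi = 2 + 1j
print(homogeneous_to_sphere(xi, (0.3 - 1.2j) * xi))
```

The last call illustrates that a common factor in $\xi$ and $\eta$ has no effect on the point obtained.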


63. Unitary groups and groups of rotations. We now consider a 
unitary transformation of the variables $(\xi, \eta)$:

$$\xi' = a\xi + b\eta, \quad \eta' = c\xi + d\eta, \tag{52}$$

where, since the transformation is unitary, we must have

$$\xi'\bar\xi' + \eta'\bar\eta' = \xi\bar\xi + \eta\bar\eta. \tag{53}$$

The new values $(\xi', \eta')$ give us a new point on the sphere:

$$x' = \frac{\bar\xi'\eta' + \xi'\bar\eta'}{\xi'\bar\xi' + \eta'\bar\eta'}; \quad y' = \frac{1}{i}\,\frac{\bar\xi'\eta' - \xi'\bar\eta'}{\xi'\bar\xi' + \eta'\bar\eta'}; \quad z' = \frac{\xi'\bar\xi' - \eta'\bar\eta'}{\xi'\bar\xi' + \eta'\bar\eta'}. \tag{54}$$

We know that the determinant of unitary transformation (52) has
unit modulus, so that it is given by a number of the form $e^{i\varphi}$. On
multiplying all the coefficients of (52) by $e^{-i\varphi/2}$, we get a unitary
transformation with determinant 1. But $\xi'$ and $\eta'$ are now also
multiplied by $e^{-i\varphi/2}$. This extra factor has no effect at all on $\zeta'$. We
can thus confine ourselves to unitary transformations (52) with the
assumption of a unit determinant, i.e.

$$ad - bc = 1. \tag{55}$$

Even with this restriction, two transformations, with coefficients
of differing sign, give us values of $\xi'$ and $\eta'$ of differing sign, and we
arrive at the same point $\zeta'$ under both transformations.

If we replace $\xi'$ and $\eta'$ in (54) by their expressions (52) and take
condition (53) into account, we see on using (51) that the variables
(x', y', z') are expressed as linear homogeneous polynomials in
(x, y, z). By (53), the denominators are the same in (51) and (54),
and variables (x, y, z) undergo the same linear transformation as the
expressions

$$u = \bar\xi\eta + \xi\bar\eta; \quad v = \frac{1}{i}\,(\bar\xi\eta - \xi\bar\eta); \quad w = \xi\bar\xi - \eta\bar\eta \tag{56}$$

under unitary transformation (52). We establish later the exact form
of this linear transformation.




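We first of all establish the general form of unitary transformations
(52) with unit determinant. The general conditions for unitary
transformations yield [28]:

$$a\bar c + b\bar d = 0; \quad c\bar c + d\bar d = 1.$$

On multiplying (55) by $\bar c$ and using the first condition written,
we get

$$-b\,d\bar d - b\,c\bar c = \bar c,$$

whence we have by the second condition, $\bar c = -b$ or $c = -\bar b$, and we
can show similarly that $d = \bar a$. Hence we can write all the unitary
transformations with unit determinant as follows:

$$\xi' = a\xi + b\eta, \qquad \eta' = -\bar b\,\xi + \bar a\,\eta, \tag{57}$$

where a and b are any complex numbers satisfying the condition

$$a\bar a + b\bar b = 1. \tag{58}$$

We now write (56) with new variables

$$u' + iv' = 2\bar\xi'\eta'; \quad u' - iv' = 2\xi'\bar\eta'; \quad w' = \xi'\bar\xi' - \eta'\bar\eta',$$

or, using (57):

$$
\begin{aligned}
u' + iv' &= \bar a^2\, 2\bar\xi\eta - \bar b^2\, 2\xi\bar\eta - 2\bar a\bar b\,(\xi\bar\xi - \eta\bar\eta),\\
u' - iv' &= -b^2\, 2\bar\xi\eta + a^2\, 2\xi\bar\eta - 2ab\,(\xi\bar\xi - \eta\bar\eta),\\
w' &= \bar a b\, 2\bar\xi\eta + a\bar b\, 2\xi\bar\eta + (a\bar a - b\bar b)(\xi\bar\xi - \eta\bar\eta).
\end{aligned}
$$

On making the substitutions

$$2\bar\xi\eta = u + iv; \quad 2\xi\bar\eta = u - iv; \quad \xi\bar\xi - \eta\bar\eta = w$$

and adding and subtracting the first two equations, we get expressions
for (u', v', w') in terms of (u, v, w), or, what amounts to the same
thing, expressions for (x', y', z') in terms of (x, y, z):

$$
\begin{aligned}
x' &= \tfrac12\,(a^2 + \bar a^2 - b^2 - \bar b^2)\,x + \tfrac{i}{2}\,(\bar a^2 + \bar b^2 - a^2 - b^2)\,y - (\bar a\bar b + ab)\,z,\\
y' &= \tfrac{i}{2}\,(a^2 + \bar b^2 - \bar a^2 - b^2)\,x + \tfrac12\,(a^2 + \bar a^2 + b^2 + \bar b^2)\,y + i\,(\bar a\bar b - ab)\,z,\\
z' &= (\bar a b + a\bar b)\,x + i\,(\bar a b - a\bar b)\,y + (a\bar a - b\bar b)\,z.
\end{aligned}
\tag{59}
$$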




For every unitary transformation (57) there is a corresponding
transformation of the xy plane, and this in turn, in view of the
correspondence set up by the stereographic projection, gives a
transformation of the sphere.

Accordingly, (59) is a real transformation by virtue of which the
equation

$$x^2 + y^2 + z^2 = 1$$

becomes

$$x'^2 + y'^2 + z'^2 = 1.$$

But the linear homogeneous transformation (59) does not change
the constant term 1, and consequently it must leave unchanged the
left-hand side of the equation, i.e.

$$x'^2 + y'^2 + z'^2 = x^2 + y^2 + z^2.$$

All these facts can be obtained directly from the form of (59). Having
shown that expressions (59) yield a real orthogonal transformation
in three variables, we now show that the determinant of the
transformation is always equal to (+1). The determinant is a continuous
function of the real and imaginary parts of the complex variables a and b,
which must satisfy relationship (58). But the determinant can only
have a value of (+1) or (-1), and in view of the continuity the value
must be either always (+1) or always (-1). But with a = 1 and
b = 0 expressions (59) give us the identity transformation with
(+1) determinant, i.e. the determinant of (59) is in fact always
(+1). Linear transformations (59) thus represent a rotation of space
about the origin.
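
The claim that (59) is a real orthogonal transformation with determinant (+1) can be tested numerically for particular values of the parameters. The sketch below is hedged: the coefficient pattern is the one obtained by substituting (57) into (56) as described in the text, with the placement of the conjugation bars being a reading of the printed formula, and `rotation_from_unitary` is an illustrative name.

```python
def rotation_from_unitary(a, b):
    """3x3 matrix of (59) for complex a, b with |a|^2 + |b|^2 = 1."""
    a, b = complex(a), complex(b)
    ac, bc = a.conjugate(), b.conjugate()
    rows = [
        [0.5 * (a * a + ac * ac - b * b - bc * bc),
         0.5j * (ac * ac + bc * bc - a * a - b * b),
         -(ac * bc + a * b)],
        [0.5j * (a * a + bc * bc - ac * ac - b * b),
         0.5 * (a * a + ac * ac + b * b + bc * bc),
         1j * (ac * bc - a * b)],
        [ac * b + a * bc,
         1j * (ac * b - a * bc),
         a * ac - b * bc],
    ]
    # Every entry is real up to rounding, so only the real part is kept.
    return [[entry.real for entry in row] for row in rows]


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def gram(m):
    """Matrix of scalar products of the rows of m (the identity for an orthogonal m)."""
    return [[sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)] for i in range(3)]


# Sample parameters with |a|^2 + |b|^2 = 0.5 + 0.5 = 1.
R = rotation_from_unitary(0.5 + 0.5j, 0.1 + 0.7j)
print(R)
print(gram(R))
print(det3(R))
```

For these sample values the rows come out orthonormal and the determinant is +1 up to rounding, in agreement with the continuity argument of the text.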

We now show that every rotation of space can be written in the
form (59). If we set

$$a = e^{-i\varphi/2}; \quad \bar a = e^{i\varphi/2}; \quad b = \bar b = 0,$$

i.e. we take the unitary transformation matrix

$$A_\varphi = \begin{pmatrix} e^{-i\varphi/2} & 0 \\ 0 & e^{i\varphi/2} \end{pmatrix}, \tag{60}$$

expressions (59) give us

$$
\begin{aligned}
x' &= x\cos\varphi - y\sin\varphi,\\
y' &= x\sin\varphi + y\cos\varphi,\\
z' &= z,
\end{aligned}
\tag{61}
$$

i.e. we get a rotation by the angle $\varphi$ about the z axis.

If we now take

$$a = \bar a = \cos\frac{\psi}{2}; \quad b = -i\sin\frac{\psi}{2}; \quad \bar b = i\sin\frac{\psi}{2},$$

i.e. we define the matrix of the unitary transformation as follows:

$$B_\psi = \begin{pmatrix} \cos\frac{\psi}{2} & -i\sin\frac{\psi}{2} \\ -i\sin\frac{\psi}{2} & \cos\frac{\psi}{2} \end{pmatrix}, \tag{62}$$

(59) gives us

$$
\begin{aligned}
x' &= x,\\
y' &= y\cos\psi - z\sin\psi,\\
z' &= y\sin\psi + z\cos\psi.
\end{aligned}
\tag{63}
$$

This is a rotation by the angle $\psi$ about the x axis.

But as we know from [20], every rotation with Eulerian angles
$\{\alpha, \beta, \gamma\}$ can be obtained as a result of rotation by the angle $\alpha$ about
the z axis, followed by a rotation of $\beta$ about the new x axis, followed
by a rotation of $\gamma$ about the new z axis.

If we write $Z_\varphi$ for the third order matrix corresponding to
transformation (61), and $X_\psi$ for the matrix of transformation (63), a
rotation about the z axis by the angle $\alpha$ will be accomplished by the matrix
$Z_\alpha$, and the new x axis will now be obtained from the previous one with
the aid of this matrix. A rotation of $\beta$ about the new x axis will be
accomplished, as may readily be seen, with the aid of $Z_\alpha X_\beta Z_\alpha^{-1}$,
and these first two rotations are obtained by

$$Z_\alpha X_\beta Z_\alpha^{-1} Z_\alpha = Z_\alpha X_\beta.$$

As above, a rotation of $\gamma$ about the new z axis is obtained by

$$(Z_\alpha X_\beta)\, Z_\gamma\, (Z_\alpha X_\beta)^{-1},$$

and finally the rotation $\{\alpha, \beta, \gamma\}$ is produced by

$$(Z_\alpha X_\beta)\, Z_\gamma\, (Z_\alpha X_\beta)^{-1} (Z_\alpha X_\beta)$$

or

$$Z_\alpha X_\beta Z_\gamma. \tag{64}$$

In the above arguments we have used the obvious fact that, if $Z_\varphi$
is the matrix giving a rotation of $\varphi$ about the l axis, passing through
the origin, and M is a matrix transforming l to $l_1$, a rotation of $\varphi$
about $l_1$ is given by the similar matrix

$$M Z_\varphi M^{-1}.$$




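We now remark that, if $A_1$ and $A_2$ are two unitary transformations
(57), corresponding to which we have orthogonal transformations
(59) $U_1$ and $U_2$, the product $A_2 A_1$ will clearly correspond to $U_2 U_1$.
Thus by (64), the rotation $\{\alpha, \beta, \gamma\}$ will be achieved by the unitary
matrix consisting of the product of the three unitary matrices

$$
\begin{pmatrix} e^{-i\alpha/2} & 0 \\ 0 & e^{i\alpha/2} \end{pmatrix}
\begin{pmatrix} \cos\frac{\beta}{2} & -i\sin\frac{\beta}{2} \\ -i\sin\frac{\beta}{2} & \cos\frac{\beta}{2} \end{pmatrix}
\begin{pmatrix} e^{-i\gamma/2} & 0 \\ 0 & e^{i\gamma/2} \end{pmatrix}.
\tag{65}
$$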


Thus for every unitary transformation there is a corresponding 
definite rotation of three-dimensional space, and all rotations may be 
obtained in this way. The product of two unitary transformations 
corresponds to the product of the corresponding rotations. We can say 
that expressions (59) define the homomorphism of the group of 
unitary transformations with unity determinant with the group of 
rotations of three-dimensional space. 
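
The homomorphism just described can be checked numerically on a pair of sample transformations. In the sketch below a unitary matrix of form (57) is stored as the pair (a, b); the 3x3 matrix is built from the coefficients obtained by substituting (57) into (56), which is an assumed reading of the printed formula (59), and all function names are illustrative.

```python
import cmath
import math


def rotation_from_unitary(a, b):
    """3x3 matrix of (59) for complex a, b with |a|^2 + |b|^2 = 1."""
    a, b = complex(a), complex(b)
    ac, bc = a.conjugate(), b.conjugate()
    rows = [
        [0.5 * (a * a + ac * ac - b * b - bc * bc),
         0.5j * (ac * ac + bc * bc - a * a - b * b),
         -(ac * bc + a * b)],
        [0.5j * (a * a + bc * bc - ac * ac - b * b),
         0.5 * (a * a + ac * ac + b * b + bc * bc),
         1j * (ac * bc - a * b)],
        [ac * b + a * bc, 1j * (ac * b - a * bc), a * ac - b * bc],
    ]
    return [[entry.real for entry in row] for row in rows]


def unitary_product(p, q):
    """First row (a, b) of the product of two matrices [[a, b], [-conj(b), conj(a)]]."""
    (a1, b1), (a2, b2) = p, q
    return (a1 * a2 - b1 * b2.conjugate(), a1 * b2 + b1 * a2.conjugate())


def matmul3(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def close(m, n, tol=1e-12):
    return all(abs(m[i][j] - n[i][j]) < tol for i in range(3) for j in range(3))


p = (cmath.exp(-0.35j), 0j)                          # rotation by 0.7 about the z axis
q = (complex(math.cos(0.2)), -1j * math.sin(0.2))    # rotation by 0.4 about the x axis

lhs = rotation_from_unitary(*unitary_product(p, q))
rhs = matmul3(rotation_from_unitary(*p), rotation_from_unitary(*q))
homomorphism_ok = close(lhs, rhs)

# Changing the sign of both parameters leaves the rotation unchanged.
sign_ok = close(rotation_from_unitary(*p), rotation_from_unitary(-p[0], -p[1]))

print(homomorphism_ok, sign_ok)
```

The second check anticipates the remark that two unitary matrices differing only in sign give the same rotation, because a and b enter the coefficients only through products of two factors.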

We now consider what unitary transformations yield the identity
transformation, i.e. the identity element in the rotation group. The
third of expressions (59) gives us here

$$a\bar b = 0; \quad a\bar a - b\bar b = 1,$$

whence |a| = 1 and b = 0. Let $a = e^{i\theta}$. The first of (59) gives

$$\tfrac12\,(e^{2i\theta} + e^{-2i\theta}) = 1.$$

It follows at once that $\theta = 0$ or $\pi$, i.e. $a = \pm 1$.
We thus have two unitary transformations with matrices

$$E = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}; \quad -E = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix},$$

to which the identity element of the rotation group corresponds.

Now suppose that two unitary transformations U and V give the
same rotation. In this case, $V^{-1}U$ will give the identity transformation
in the rotation group, i.e. $V^{-1}U = E$ or $(-E)$, i.e. $U = V$ or
$U = -V$. We remark here that the (-) sign in front of a matrix
means that the signs of all the elements must be changed. The above
discussion shows that unitary transformations (57) only lead to the
same space rotation when they differ simply in sign. Conversely,
if they only differ in sign, they give the same rotation, as shown
above and as also follows from (59). We can finally say that the
rotation group is homomorphic with the unitary transformation group
with unity determinant, the same rotations being obtained when and only
when the unitary matrices differ only in sign.

The matrices E and (-E) form a normal subgroup H of the group
G of unitary transformations (57) with unity determinant. Any
coset with respect to the normal subgroup H consists of two elements
$G_0$ and $(-G_0)$, where $G_0$ is any element of the group. It follows at once
from what was said above that the rotation group is isomorphic to the
complementary group to H.

Expressions (59) contain two complex parameters a and b which
must satisfy relationship (58). Each of the complex parameters
contains two real parameters,

$$a = a_1 + i a_2; \quad b = b_1 + i b_2,$$

and (58) is equivalent to the following:

$$a_1^2 + a_2^2 + b_1^2 + b_2^2 = 1.$$

Expressions (59) thus contain four real parameters which must
satisfy a single relationship, i.e. (59) contain three independent real
parameters, as must be the case for the rotation group. The parameters
a and b are generally known as the Cayley-Klein parameters. It is
easy to obtain their expression in terms of the Eulerian angles. For
multiplication of the three unitary matrices (65) gives us, as we saw
above, the unitary matrix which corresponds to the rotation with
Eulerian angles $\{\alpha, \beta, \gamma\}$. On carrying out the multiplication, we
obtain the following expressions for the corresponding parameters
a and b:

$$a = e^{-i(\alpha + \gamma)/2} \cos\frac{\beta}{2}; \quad b = -i\, e^{i(\gamma - \alpha)/2} \sin\frac{\beta}{2}. \tag{66}$$

If $2\pi$ is added to $\alpha$ or $\gamma$, a and b change sign whilst the rotation
remains in essence the same. This matter has already been mentioned
above.
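
The multiplication of the three matrices can be carried out numerically and compared with a closed form. The sketch below assumes that the diagonal matrix for a rotation by $\varphi$ about the z axis has entries $e^{-i\varphi/2}$, $e^{i\varphi/2}$ and that the matrix for a rotation by $\psi$ about the x axis has $\cos(\psi/2)$ on the diagonal and $-i\sin(\psi/2)$ off it; the closed form in `closed_form` is what the product gives under those assumptions. All names are illustrative.

```python
import cmath
import math


def unitary_product(p, q):
    """First row (a, b) of the product of two matrices [[a, b], [-conj(b), conj(a)]]."""
    (a1, b1), (a2, b2) = p, q
    return (a1 * a2 - b1 * b2.conjugate(), a1 * b2 + b1 * a2.conjugate())


def z_rotation(phi):
    """Parameters (a, b) of the diagonal matrix for a rotation about the z axis."""
    return (cmath.exp(-0.5j * phi), 0j)


def x_rotation(psi):
    """Parameters (a, b) of the matrix for a rotation about the x axis."""
    return (complex(math.cos(psi / 2)), -1j * math.sin(psi / 2))


def cayley_klein(alpha, beta, gamma):
    """Parameters obtained by multiplying the three matrices in the order z, x, z."""
    return unitary_product(
        unitary_product(z_rotation(alpha), x_rotation(beta)), z_rotation(gamma)
    )


def closed_form(alpha, beta, gamma):
    a = cmath.exp(-0.5j * (alpha + gamma)) * math.cos(beta / 2)
    b = -1j * cmath.exp(0.5j * (gamma - alpha)) * math.sin(beta / 2)
    return (a, b)


angles = (0.4, 1.1, -0.7)
a, b = cayley_klein(*angles)
a_cf, b_cf = closed_form(*angles)
print(abs(a - a_cf), abs(b - b_cf))          # both differences should be negligible
print(abs(a) ** 2 + abs(b) ** 2)             # relationship (58)

# Adding 2*pi to alpha changes the sign of both parameters.
a2, b2 = cayley_klein(angles[0] + 2 * math.pi, angles[1], angles[2])
print(abs(a2 + a), abs(b2 + b))
```

The last line illustrates the sign change under the addition of $2\pi$ to an Eulerian angle, which leaves the rotation itself unchanged.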


64. The general linear group and the Lorentz group. We have just
established the close connection between the unitary group in two
variables and the group of three-dimensional rotations. Similarly,
a connection can be established between the general linear group
in two variables with unity determinants, and the Lorentz group.


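We introduce four variables

$$x_0,\ x_1,\ x_2,\ x_3$$

and, on returning to expressions (51) for the stereographic projection,
set in these

$$x = \frac{x_1}{x_0}; \quad y = \frac{x_2}{x_0}; \quad z = \frac{x_3}{x_0}. \tag{67}$$

This gives us the following expressions:

$$\frac{x_1}{x_0} = \frac{\bar\xi\eta + \xi\bar\eta}{\xi\bar\xi + \eta\bar\eta}; \quad \frac{x_2}{x_0} = \frac{1}{i}\,\frac{\bar\xi\eta - \xi\bar\eta}{\xi\bar\xi + \eta\bar\eta}; \quad \frac{x_3}{x_0} = \frac{\xi\bar\xi - \eta\bar\eta}{\xi\bar\xi + \eta\bar\eta}.$$

These define the $x_k$ up to a common factor, and we can write

$$
\begin{aligned}
x_0 &= \xi\bar\xi + \eta\bar\eta; & x_1 &= \bar\xi\eta + \xi\bar\eta;\\
x_2 &= \frac{1}{i}\,(\bar\xi\eta - \xi\bar\eta); & x_3 &= \xi\bar\xi - \eta\bar\eta.
\end{aligned}
\tag{68}
$$

The previous variables satisfied relationship (51$_1$), and hence, by
(67), the new variables given by (68) will satisfy for any complex
$\xi$ and $\eta$ the expression

$$x_1^2 + x_2^2 + x_3^2 - x_0^2 = 0. \tag{69}$$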


In the case of a unitary transformation of $\xi$ and $\eta$, the expression
$(\xi\bar\xi + \eta\bar\eta)$ remained invariable, i.e. by (68), the variable $x_0$,
here expressing time, remained invariable, and we thus obtained a
rotation of three-dimensional space. We now abandon unitariness and
take the general group of linear transformations

$$\xi' = a\xi + b\eta; \quad \eta' = c\xi + d\eta. \tag{70}$$

We use a similar approach to that for unitary transformations.

We form the expressions

$$
\begin{aligned}
x_1 + i x_2 &= 2\bar\xi\eta; & x_1 - i x_2 &= 2\xi\bar\eta;\\
x_0 + x_3 &= 2\xi\bar\xi; & x_0 - x_3 &= 2\eta\bar\eta.
\end{aligned}
\tag{71}
$$

For the new variables $\xi'$, $\eta'$ we obtain new $x_k'$.
On substituting expressions (70) and using (71), we get

$$
\begin{aligned}
x_1' + i x_2' &= \bar a d\,(x_1 + i x_2) + \bar b c\,(x_1 - i x_2) + \bar a c\,(x_0 + x_3) + \bar b d\,(x_0 - x_3),\\
x_1' - i x_2' &= b \bar c\,(x_1 + i x_2) + a \bar d\,(x_1 - i x_2) + a \bar c\,(x_0 + x_3) + b \bar d\,(x_0 - x_3),\\
x_0' + x_3' &= \bar a b\,(x_1 + i x_2) + a \bar b\,(x_1 - i x_2) + a \bar a\,(x_0 + x_3) + b \bar b\,(x_0 - x_3),\\
x_0' - x_3' &= \bar c d\,(x_1 + i x_2) + c \bar d\,(x_1 - i x_2) + c \bar c\,(x_0 + x_3) + d \bar d\,(x_0 - x_3),
\end{aligned}
\tag{72}
$$

whence linear expressions which we shall omit may be obtained with
real coefficients for the $x_k'$ in terms of the $x_k$. We merely observe that,
if the last two of equations (72) are added, the coefficient of $x_0$ in
the expression for $x_0'$ turns out to be equal to $\tfrac12\,(a\bar a + b\bar b + c\bar c + d\bar d)$,
i.e. the coefficient is positive.

The new variables satisfy a similar relationship to the old:

$$x_1'^2 + x_2'^2 + x_3'^2 - x_0'^2 = 0. \tag{73}$$

If we replace the $x_k'$ on the left-hand side of this equation by their
expressions in terms of the $x_k$, (69) must be obtained. But it is possible
for the left-hand side of (73) to differ from the left-hand side of (69)
by a constant factor, i.e. we have here

$$x_1'^2 + x_2'^2 + x_3'^2 - x_0'^2 = k\,(x_1^2 + x_2^2 + x_3^2 - x_0^2),$$

where k is a constant. On using the above expressions and taking
into account the fact that

$$x_1^2 + x_2^2 + x_3^2 - x_0^2 = (x_1 + i x_2)(x_1 - i x_2) - (x_0 + x_3)(x_0 - x_3),$$

it is easily shown that $k = (ad - bc)(\bar a\bar d - \bar b\bar c) = |ad - bc|^2$. If we
want to have k = 1, i.e. the Lorentz transformation

$$x_1'^2 + x_2'^2 + x_3'^2 - x_0'^2 = x_1^2 + x_2^2 + x_3^2 - x_0^2, \tag{74}$$

we have to take linear transformations (70) with determinants of unit
modulus, i.e. of the form $e^{i\varphi}$. On multiplying all the coefficients of
(70) by $e^{-i\varphi/2}$ as before, on the one hand, we do not change the $x_k'$
defined by (68) with $\xi$ and $\eta$ replaced by $\xi'$ and $\eta'$, since expressions
(68) contain the product of one of the quantities $(\xi', \eta')$ with one of
$(\bar\xi', \bar\eta')$, and on the other hand, we reduce the determinant to unity.
We therefore take transformations (70) as having unity determinant:

$$ad - bc = 1. \tag{75}$$

We can show as in the previous section that the linear transformation
giving the $x_k'$ in terms of the $x_k$ has a (+1) determinant. We recall
furthermore that the coefficient of $x_0$ in the expression for $x_0'$ is positive
here, i.e. the transformation has a (+1) determinant and does not
change the direction for measuring time, or in other words, (72) yield
positive Lorentz transformations.

To sum up, linear transformations under condition (75) give the positive
Lorentz transformations which we defined in [54].
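
The passage from a linear transformation with unit determinant to a positive Lorentz transformation can be traced numerically. The sketch below rests on assumptions: the variables are ordered $(x_0, x_1, x_2, x_3)$ with $x_0$ the time variable, the coefficient table `M` is the one obtained by substituting (70) into the products $2\bar\xi'\eta'$, $2\xi'\bar\eta'$, $2\xi'\bar\xi'$, $2\eta'\bar\eta'$, and the function name `lorentz_from_sl2` is illustrative.

```python
def lorentz_from_sl2(a, b, c, d):
    """Real 4x4 matrix acting on (x0, x1, x2, x3), built from (72)."""
    a, b, c, d = (complex(z) for z in (a, b, c, d))
    ac, bc, cc, dc = (z.conjugate() for z in (a, b, c, d))
    # Rows act on (x1 + i x2, x1 - i x2, x0 + x3, x0 - x3).
    M = [
        [ac * d, bc * c, ac * c, bc * d],
        [b * cc, a * dc, a * cc, b * dc],
        [ac * b, a * bc, a * ac, b * bc],
        [cc * d, c * dc, c * cc, d * dc],
    ]
    columns = []
    for k in range(4):
        x = [1.0 if j == k else 0.0 for j in range(4)]
        p = [x[1] + 1j * x[2], x[1] - 1j * x[2], x[0] + x[3], x[0] - x[3]]
        q = [sum(M[r][s] * p[s] for s in range(4)) for r in range(4)]
        image = [(q[2] + q[3]) / 2, (q[0] + q[1]) / 2, (q[0] - q[1]) / 2j, (q[2] - q[3]) / 2]
        columns.append([value.real for value in image])
    return [[columns[k][r] for k in range(4)] for r in range(4)]


def preserves_form(L, tol=1e-9):
    """Check that x1^2 + x2^2 + x3^2 - x0^2 is left unchanged by L."""
    g = [-1.0, 1.0, 1.0, 1.0]
    for i in range(4):
        for j in range(4):
            value = sum(g[r] * L[r][i] * L[r][j] for r in range(4))
            target = g[i] if i == j else 0.0
            if abs(value - target) > tol:
                return False
    return True


# Sample coefficients with ad - bc = (1 + i) - i = 1.
L = lorentz_from_sl2(1 + 1j, 0.5j, 2, 1)
print(preserves_form(L), L[0][0])

# A diagonal transformation xi' = l xi, eta' = eta / l mixes only x0 and x3.
l = 0.8
boost = lorentz_from_sl2(l, 0, 0, 1 / l)
print(boost[3][3], boost[3][0])
```

For the sample coefficients the quadratic form is preserved and the coefficient of $x_0$ in $x_0'$ equals half the sum of the squared moduli of a, b, c, d, which is positive, so the direction of time is not reversed.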

As in the previous section, we now pose the question of whether
any given positive Lorentz transformation can be obtained from (72). We
remark first of all that, as in the previous section, corresponding to
the product of two linear transformations (70) we have the product
of the corresponding Lorentz transformations, or more precisely: if
A and B are two linear transformations (70) which lead in accordance
with (72) to Lorentz transformations $T_1$ and $T_2$, corresponding to the
linear transformation BA we have the Lorentz transformation $T_2 T_1$.
As we saw in [54], every positive Lorentz transformation can be written
in the form

$$T = VSU,$$


where U and V are simple rotations of three-dimensional space and 
S is a positive Lorentz transformation in two variables. In accordance 
with the results of the previous section, we can obtain any rotation with 
the aid of a unitary transformation of type (70) with unity determinant. 
It thus remains for us to show that we can get any positive Lorentz 
transformation S in two variables from (72), given a suitable choice 
of linear transformation (70). On comparing (74) with (21) of [54],
it will be seen that we now take c = 1, so that expressions (17) of
[54] for the positive Lorentz transformations in two variables become

$$x_3' = \frac{x_3 - v x_0}{\sqrt{1 - v^2}}; \quad x_0' = \frac{x_0 - v x_3}{\sqrt{1 - v^2}}; \quad x_1' = x_1; \quad x_2' = x_2, \tag{76}$$

and take the particular form of transformation (70):

$$\xi' = l\xi; \quad \eta' = \frac{1}{l}\,\eta,$$

where l is a real constant. The determinant of this is evidently unity.
In the present case a = l, d = 1/l, and b = c = 0. On making these
substitutions in (72), we in fact obtain (76) if l satisfies the conditions:

$$\frac12\left(l^2 + \frac{1}{l^2}\right) = \frac{1}{\sqrt{1 - v^2}}; \quad \frac12\left(l^2 - \frac{1}{l^2}\right) = -\frac{v}{\sqrt{1 - v^2}}.$$

The first condition at once gives us $l^2 = \mu \pm \sqrt{\mu^2 - 1}$, where
$\mu = 1/\sqrt{1 - v^2}$. The second condition
shows that we have to take the root less than unity for $l^2$ with v > 0,
and the root greater than unity with v < 0, the second condition
being thereby fulfilled. On extracting the root, we get two values of
opposite sign for l. We can say finally that the group of linear
transformations (70) with determinant 1 is homomorphic with the group of
positive Lorentz transformations, the homomorphism being in accordance
with (72). As in the previous section, this homomorphism is not an
isomorphism, i.e. different transformations (70) can lead to the same
Lorentz transformation. It follows at once from (72) that the identity
transformation in the Lorentz group is obtained from the two linear
transformations with matrices

$$E = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}; \quad -E = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix},$$

and it can be shown precisely as in the previous section that any
transformation of the Lorentz group can simply be obtained from two
linear transformations (70) whose coefficients differ only in sign.

As in [63], the elements E and (-E) form a normal subgroup
H of the group of linear transformations with determinant 1, and
the group of positive Lorentz transformations is isomorphic with the
group complementary to H.

Linear transformations (70) contain four complex coefficients
related by condition (75). Expressions (72) thus contain three arbitrary
complex parameters, or in other words, six arbitrary real parameters.


65. Representation of a group by linear transformations. Let G be
a group with elements $G_\alpha$, and suppose that there is a definite matrix
$A_\alpha$ corresponding to each $G_\alpha$, all the $A_\alpha$ having the same order and
non-zero determinants. Suppose further that the correspondence is
such that the product $A_{\alpha_2} A_{\alpha_1}$ of matrices $A_{\alpha_2}$ and $A_{\alpha_1}$ corresponds to
the product $G_{\alpha_2} G_{\alpha_1}$. We say in this case that the matrices $A_\alpha$ or the
corresponding linear transformations give a linear representation of
the group G. Let $G_0$ be the identity element of the group and $A_0$
the corresponding matrix. Since $G_0 G_0 = G_0$, we must have $A_0 A_0 = A_0$,
whence, on multiplying on the right by $A_0^{-1}$, we have $A_0 = I$,
i.e. the identity element must correspond to the unit matrix. Let $G_{\alpha_1}$
and $G_{\alpha_2}$ be inverse, and $A_{\alpha_1}$, $A_{\alpha_2}$ the corresponding matrices. It follows
from $G_{\alpha_1} G_{\alpha_2} = G_0$ that $A_{\alpha_1} A_{\alpha_2} = I$, i.e. inverse matrices
correspond to inverse elements. An immediate consequence of the above
is that the matrices $A_\alpha$ (or the corresponding linear transformations)
form a group A homomorphic to group G. If distinct matrices correspond
to distinct elements of G, A is isomorphic as well as homomorphic
to G. It is said in this case to give a one-to-one linear representation of
group G.

If this is not the case, the set of elements of G, to which the unit 
matrix in A corresponds, forms a normal subgroup of G, and group 
A is isomorphic to the group complementary to this normal divisor 
[57]. 

If the basic group G is itself a group of linear transformations, a 
possible linear representation is yielded by the group itself. 

We notice one point in connection with the definition of linear
representation. Suppose we know that to every element $G_\alpha$ there
corresponds a definite matrix $A_\alpha$, and that a product of matrices
corresponds to a product of elements, but that we are unaware of whether
the determinants of the $A_\alpha$ vanish or not. We show that, if one
determinant $D(A_{\alpha_0})$ vanishes, all the $D(A_\alpha)$ vanish. Now the set of matrices
$A_{\alpha_0} A_\alpha$ with variable $\alpha$ contains all the matrices corresponding to
elements of the group [56]. But $D(A_{\alpha_0} A_\alpha) = D(A_{\alpha_0})\, D(A_\alpha)$ and the
product vanishes since the first factor vanishes by hypothesis. Hence,
given a correspondence in which products correspond to products,
we only need to verify that one of the determinants $D(A_\alpha)$ is
non-zero; we only need to verify say that the unit matrix of A corresponds
to the identity element of G.

Let $X$ be a matrix of the same order as the $A_\alpha$ with a non-zero
determinant. We have

$$(X A_{\alpha_2} X^{-1})(X A_{\alpha_1} X^{-1}) = X A_{\alpha_2} A_{\alpha_1} X^{-1},$$

and consequently the matrices $X A_\alpha X^{-1}$ also give a linear
representation of the group $G$. Two such similar representations are
generally described as equivalent. Let the order of the $A_\alpha$ be $n$,
and let $(x_1, \dots, x_n)$ be vector components in n-dimensional space on
which the transformations $A_\alpha$ are carried out, so that the group $A$
becomes

$$x' = A_\alpha x. \tag{77}$$

As we know from [25], the equivalent linear representation

$$y' = X A_\alpha X^{-1} y \tag{78}$$

means that new axes are taken in the space, the new components being
given in terms of the old by

$$(y_1, \dots, y_n) = X (x_1, \dots, x_n). \tag{79}$$

With these new axes, linear transformations of space are now expressed
by (78), i.e. the equivalent linear representations can be obtained by a
simple change of the coordinate axes in accordance with (79). The
variables $(x_1, \dots, x_n)$ appearing in (77) are known as the objects of
the linear representation. Passage to the equivalent linear
representation is thus equivalent to replacing the objects of the linear
representation by different objects with the aid of linear
transformation (79) with non-zero determinant.
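
Passage to an equivalent representation can be illustrated numerically. The sketch below is an assumption-laden example rather than anything from the original text: it takes the cyclic group of order 4 represented by powers of a plane rotation through a right angle, chooses an arbitrary integer matrix `X` with determinant 1 (so that all arithmetic is exact), and checks that the matrices $X A X^{-1}$ still multiply as the group elements do.

```python
def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def power(mat, e):
    """e-th power of a 2x2 matrix, e >= 0."""
    result = [[1, 0], [0, 1]]
    for _ in range(e):
        result = matmul(result, mat)
    return result

R = [[0, -1], [1, 0]]        # rotation through a right angle, R^4 = I
X = [[1, 1], [0, 1]]         # arbitrary matrix with non-zero determinant (assumed)
X_inv = [[1, -1], [0, 1]]    # its inverse

def conj(a):
    """The similar matrix X a X^{-1}."""
    return matmul(matmul(X, a), X_inv)

A = [power(R, e) for e in range(4)]   # original representation of the cyclic group
B = [conj(a) for a in A]              # equivalent representation

equivalent_ok = all(
    matmul(B[p], B[q]) == B[(p + q) % 4] for p in range(4) for q in range(4)
)
print(equivalent_ok, B[1])
```

The matrices `B` differ from the matrices `A`, yet they obey the same multiplication table, which is all that the definition of a linear representation requires.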

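Let matrices $A_\alpha$ of order $n$ give a linear representation of a group
$G$, and let matrices $B_\alpha$ of order $m$ give another linear
representation of the same group. We form the quasi-diagonal matrix of
order $(n + m)$:

$$[A_\alpha, B_\alpha] = \begin{Vmatrix} A_\alpha & 0 \\ 0 & B_\alpha \end{Vmatrix}. \tag{80}$$

We have by the rule for multiplying quasi-diagonal matrices:

$$[A_{\alpha_2}, B_{\alpha_2}]\,[A_{\alpha_1}, B_{\alpha_1}] = [A_{\alpha_2} A_{\alpha_1},\ B_{\alpha_2} B_{\alpha_1}].$$

Thus matrices (80) also give a linear representation of $G$. In general,
given representations of $G$ by means of matrices $A_\alpha$, $B_\alpha$,
$C_\alpha$, we can form a new representation by using the quasi-diagonal
matrix

$$D_\alpha = [A_\alpha, B_\alpha, C_\alpha] = \begin{Vmatrix} A_\alpha & 0 & 0 \\ 0 & B_\alpha & 0 \\ 0 & 0 & C_\alpha \end{Vmatrix}. \tag{81}$$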
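
The multiplication rule for quasi-diagonal matrices can be seen at work in a small sketch. The example is illustrative only and rests on assumptions not in the original text: the group is taken to be cyclic of order 4, the first representation consists of powers of a right-angle rotation, the second of the numbers $(-1)^e$, and `direct_sum` is a hypothetical helper that builds the quasi-diagonal matrix.

```python
def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def direct_sum(a, b):
    """Quasi-diagonal matrix [a, b]: a in the top-left block, b in the bottom-right."""
    n, m = len(a), len(b)
    return [row + [0] * m for row in a] + [[0] * n + row for row in b]

def power(mat, e):
    """e-th power of a square matrix, e >= 0."""
    size = len(mat)
    result = [[1 if r == c else 0 for c in range(size)] for r in range(size)]
    for _ in range(e):
        result = matmul(result, mat)
    return result

R = [[0, -1], [1, 0]]                       # rotation through a right angle
A = [power(R, e) for e in range(4)]         # second-order representation
B = [[[(-1) ** e]] for e in range(4)]       # first-order representation
D = [direct_sum(A[e], B[e]) for e in range(4)]

product_rule_ok = all(
    matmul(D[p], D[q]) == D[(p + q) % 4] for p in range(4) for q in range(4)
)
print(product_rule_ok, D[1])
```

The third-order matrices `D` multiply block by block, so they again represent the group.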


We now observe that, if we pass to an equivalent representation by
matrices $X D_\alpha X^{-1}$, the quasi-diagonal nature of the matrices is
in general destroyed, and we can no longer say at once from the form of
this new representation that it is composed, up to an equivalent
representation, of different representations with a smaller number of
dimensions, in accordance with rule (81). If our linear representation
$D_\alpha$ has the purely quasi-diagonal form (80), it breaks down into a
number of linear representations $A_\alpha$ and $B_\alpha$ with a smaller
number of dimensions, i.e. with matrices of a smaller order. The linear
representation is said to be reduced in this case. If a linear
representation $E_\alpha$ does not have a quasi-diagonal form but some
equivalent representation $X E_\alpha X^{-1}$ has such a form, $E_\alpha$ is
said to be reducible. Finally, if neither the representation itself nor
any representation equivalent to it has the quasi-diagonal form, i.e.
neither is reduced, the representation is said to be irreducible.

We observe some conditions in which a representation can be said to be
reduced. Let a linear representation consist of matrices $A_\alpha$ of
order $n$ which yield linear transformations in the variables
$(x_1, \dots, x_n)$. We suppose that all the $A_\alpha$ are unitary, and that
the subspace $R'$, formed by the first $k$ fundamental vectors, is
transformed into itself by the $A_\alpha$, i.e. if
$x_{k+1} = x_{k+2} = \dots = x_n = 0$, then also
$x'_{k+1} = x'_{k+2} = \dots = x'_n = 0$. In other words, all the $A_\alpha$
have the form

$$\begin{Vmatrix} A'_\alpha & N_\alpha \\ 0 & A''_\alpha \end{Vmatrix}, \tag{82}$$

where $A'_\alpha$ is of order $k$, $A''_\alpha$ is of order $(n - k)$, and the
bottom left corner, with $(n - k)$ rows and $k$ columns, is filled with
zeros. We consider the subspace $R''$ formed by the last $(n - k)$
fundamental vectors. It will consist of vectors orthogonal to all the
vectors of the above $R'$. Since each $A_\alpha$ transforms $R'$ into itself
and, being unitary, preserves the orthogonality of vectors, each vector
of $R''$ must become a vector likewise belonging to $R''$ as a result of the
transformation $A_\alpha$. In other words, if $x_1 = \dots = x_k = 0$, then
also $x'_1 = \dots = x'_k = 0$. It follows at once from this that all the
elements in the top right corner of (82), with $k$ rows and $(n - k)$
columns, also vanish, i.e. the matrices of the linear representation in
question are

$$\begin{Vmatrix} A'_\alpha & 0 \\ 0 & A''_\alpha \end{Vmatrix} = [A'_\alpha, A''_\alpha],$$

and the representation is consequently reduced. Now let all the unitary
transformations $A_\alpha$ leave completely invariant a subspace $R_k$ of $k$
dimensions $(k < n)$, where $n$ is the order of the $A_\alpha$. We transform
the coordinate axes in such a way that $R_k$ is formed by the first $k$
fundamental vectors, which is the same as passing to an equivalent linear
representation and may be accomplished with the aid of a unitary
transformation. By the above, the representation becomes reduced after
this transformation. We thus have the following theorem:

THEOREM. If a linear representation of a group consists of unitary
matrices which leave some subspace unchanged, the representation is
reducible.

The reducibility or otherwise of a representation is closely bound up
with the passage from matrices $A_\alpha$ to the similar matrices
$X A_\alpha X^{-1}$. We shall notice some particular cases of passage to
equivalent representations that may be obtained by a special choice of
matrices $X$. We form a matrix $X$ as follows: the first row has unity in
the second place and zeros elsewhere, the second row has unity in the
first place and zeros elsewhere, and from the third row onwards, we have
unity on the principal diagonal and zeros elsewhere. Thus

$$X = \begin{Vmatrix} 0 & 1 & 0 & 0 & \dots & 0 \\ 1 & 0 & 0 & 0 & \dots & 0 \\ 0 & 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 0 & 1 & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots \\ 0 & 0 & 0 & 0 & \dots & 1 \end{Vmatrix}.$$

We see on expanding directly, starting with the last row, that
$D(X) = -1$. It may easily be verified by using the ordinary rules of
matrix multiplication that, if $Y$ is a given matrix, the similar matrix
$X Y X^{-1}$ is obtained from $Y$ by interchanging its first and second rows
and its first and second columns. In the same way, every interchange of
rows accompanied by a similar interchange of columns is equivalent to
passage to a similar matrix with the aid of a transformation $X$ which is
clearly independent of $Y$. Hence, if we carry out the same interchange
of rows and columns in all the matrices $A_\alpha$ yielding a linear
representation of a group, this is equivalent to passing to an equivalent
representation.
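
The effect of this particular $X$ is easy to confirm on a concrete matrix. The sketch below is illustrative only: the third-order matrix `Y` is an arbitrary choice, and `swap_matrix` and `det` are hypothetical helpers. Since interchanging the first two rows twice restores the unit matrix, $X$ is its own inverse here.

```python
def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def swap_matrix(n):
    """Unit matrix of order n with its first two rows interchanged."""
    x = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    x[0], x[1] = x[1], x[0]
    return x

def det(m):
    """Determinant by expansion along the first row (fine for small orders)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c in range(len(m)):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total

X = swap_matrix(3)
Y = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]    # arbitrary test matrix (assumed)

# X is its own inverse, so the similar matrix is X Y X.
similar = matmul(matmul(X, Y), X)
print(det(X), similar)
```

The result has the first two rows and the first two columns of `Y` interchanged, as stated in the text.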

If there exists a distribution of the integers $1, 2, \dots, n$ into two
classes such that, for each $A_\alpha$, a zero stands at the intersection of
a row numbered by an integer of one class with a column numbered by an
integer of the other class, the representation is reducible. For its
reduction is simply accomplished by interchanging rows and columns in
such a way that those numbered from one class always stand at the top and
to the left, whilst those numbered by the other class stand below and to
the right.

We conclude the present section by noticing the further case when the
linear representation of a group $G$ is of the first order, i.e. when all
the $A_\alpha$ are first order matrices, or in other words, ordinary
numbers. In this case, to each group element $G_\alpha$ there corresponds a
transformation $x' = m_\alpha x$, i.e. simply the number $m_\alpha$, and the
ordinary numerical product $m_{\alpha_2} m_{\alpha_1}$ corresponds to the
product $G_{\alpha_2} G_{\alpha_1}$.


66. Basic theorems. Let $G$ be a finite group consisting of $m$ elements
$G_1, \dots, G_m$, and let $A_1, \dots, A_m$ be nth order matrices that give a
linear representation of the group. We write the objects of this
representation as $x(x_1, \dots, x_n)$. We consider the expression

$$\varphi(x_1, \dots, x_n) = \sum_{s=1}^{m} |A_s x|^2. \tag{83}$$

This has the expanded form

$$\varphi = \sum_{s=1}^{m} \sum_{i=1}^{n} \left(a^{(s)}_{i1} x_1 + \dots + a^{(s)}_{in} x_n\right)\left(\bar a^{(s)}_{i1} \bar x_1 + \dots + \bar a^{(s)}_{in} \bar x_n\right), \tag{84}$$

where we have written $a^{(s)}_{ik}$ for the elements of matrix $A_s$. We can
easily show that (84) is an Hermitian form, i.e. the coefficients of
$\bar x_p x_q$ and $x_p \bar x_q$ are complex conjugates. Furthermore, it
follows from (83) that this Hermitian form represents the sum of the
squares of the lengths of certain vectors, i.e. the form is positive
definite [40].
In other words, on carrying out the unitary transformation

$$y = Ux,$$

reducing our form to a sum of squares:

$$\varphi = \sum_{i=1}^{n} \lambda_i \bar y_i y_i,$$

all the coefficients $\lambda_i$ will be positive. On carrying out the
further transformation $z_i = \sqrt{\lambda_i}\, y_i$ we get an expression for
the Hermitian form $\varphi$ as a sum of pure squares:

$$\varphi = \bar z_1 z_1 + \dots + \bar z_n z_n. \tag{85}$$

We subject the variables $(x_1, \dots, x_n)$ to the transformation

$$x' = A_\alpha x, \tag{86}$$

belonging to the linear representation of our group. It may easily be
seen that, with this, the form $\varphi$ remains unchanged.
In fact:

$$\varphi(x'_1, \dots, x'_n) = \sum_{s=1}^{m} |A_s A_\alpha x|^2.$$

But, as we know from [56], the set of transformations (matrices)

$$A_1 A_\alpha,\ A_2 A_\alpha,\ \dots,\ A_m A_\alpha$$

is the same as the set

$$A_1,\ A_2,\ \dots,\ A_m,$$

so that the last sum coincides with (83). Therefore, if we express
transformation (86) in new variables $(z_1, \dots, z_n)$, related to the old
by an expression of the form

$$(z_1, \dots, z_n) = B(x_1, \dots, x_n),$$

where $B$ is a matrix, we get instead of the group $A_\alpha$ the similar
group $B A_\alpha B^{-1}$, and none of the transformations of this similar
group will change (85), i.e. change the sum of the squares of the moduli;
or in other words, all these transformations are unitary. We have thus
shown that, for finite groups, every linear representation is equivalent
to some unitary representation, i.e. a representation consisting of
unitary transformations. This property is preserved, with certain
supplementary conditions, for linear representations of parametrically
dependent infinite groups, and in future, when a linear representation
of a group is mentioned, we shall always understand it to be unitary.
We now have the following theorem.

THEOREM I. Every linear representation of a (finite) group has an
equivalent unitary representation.
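
The construction behind Theorem I can be followed on the smallest non-trivial case. The sketch below is illustrative and makes several assumptions that are not in the original text: the group has order 2, its non-identity element is represented by the real, non-orthogonal matrix `[[1, 1], [0, -1]]`, and the matrix `B` with $\varphi(x) = |Bx|^2$ is obtained by a Cholesky-type factorisation rather than by the two-step reduction $y = Ux$, $z_i = \sqrt{\lambda_i}\,y_i$ of the text (both give a valid $B$). Since everything is real, "unitary" reduces to "orthogonal".

```python
import math

def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Representation of the group of order 2: the unit matrix and A with A*A = I.
group = [[[1, 0], [0, 1]], [[1, 1], [0, -1]]]

# Matrix H of the form phi(x) = sum_s |A_s x|^2, i.e. H = sum_s A_s^T A_s.
H = [[0, 0], [0, 0]]
for a in group:
    ata = matmul(transpose(a), a)
    H = [[H[i][j] + ata[i][j] for j in range(2)] for i in range(2)]

# The form is unchanged by every transformation of the representation.
form_invariant = all(matmul(matmul(transpose(a), H), a) == H for a in group)

# Upper triangular B with B^T B = H, so that phi(x) = |B x|^2.
b11 = math.sqrt(H[0][0])
b12 = H[0][1] / b11
b22 = math.sqrt(H[1][1] - b12 ** 2)
B = [[b11, b12], [0.0, b22]]
B_inv = [[1 / b11, -b12 / (b11 * b22)], [0.0, 1 / b22]]

def is_orthogonal(u, tol=1e-12):
    """True when u^T u is the unit matrix to within tol."""
    utu = matmul(transpose(u), u)
    return all(abs(utu[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

# The similar matrices B A B^{-1} are orthogonal although A itself is not.
unitary_ok = all(is_orthogonal(matmul(matmul(B, a), B_inv)) for a in group)
print(H, form_invariant, unitary_ok)
```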

We now find the necessary and sufficient condition for the reducibility
of a linear representation. We first introduce a new term and call a
diagonal matrix $[k, \dots, k]$ with the same elements on the diagonal a
scalar matrix. Such a matrix may be written $kI$. As we have seen above,
it is equivalent to the number $k$ as regards algebraic operations.

Suppose we are given a reducible linear representation of a group.
The representation will be accomplished say by matrices of the form

$$D_\alpha = X[A_\alpha, B_\alpha, C_\alpha] X^{-1},$$

where the interior matrix is quasi-diagonal and $X$ is a given matrix.
We form the matrix

$$Y = X[kI, lI, mI] X^{-1},$$

where the quasi-diagonal middle term has the same structure as in the
$D_\alpha$. It may easily be seen that $Y$ commutes with all the $D_\alpha$.
For

$$D_\alpha Y = X[A_\alpha k, B_\alpha l, C_\alpha m] X^{-1},$$

and similarly

$$Y D_\alpha = X[k A_\alpha, l B_\alpha, m C_\alpha] X^{-1}.$$

But the order of the factors plays no part when a matrix is multiplied
by a number. Furthermore, if the numbers $k$, $l$, $m$ are different, as we
shall assume, $Y$ is not a scalar matrix. For it clearly has distinct
characteristic roots $k$, $l$, $m$. We thus arrive at the following
theorem.

THEOREM II. If a linear representation is reducible, there exists a
matrix differing from a scalar matrix that commutes with all the matrices
appearing in the representation.
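
A two-dimensional instance shows the commuting matrix explicitly. The following sketch is illustrative only and assumes data that are not in the original text: a group of order 2 whose reduced representation is $[1, 1]$ and $[1, -1]$, an arbitrary integer matrix `X` of determinant 1, and the numbers $k = 2$, $l = 5$.

```python
def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

X = [[1, 1], [0, 1]]         # arbitrary matrix with non-zero determinant (assumed)
X_inv = [[1, -1], [0, 1]]

def conj(m):
    """The similar matrix X m X^{-1}."""
    return matmul(matmul(X, m), X_inv)

# Reducible representation of the group of order 2: D = X [A, B] X^{-1}
# with first-order blocks A = 1, B = +1 or -1.
D = [conj([[1, 0], [0, 1]]), conj([[1, 0], [0, -1]])]

# Y = X [k, l] X^{-1} with distinct numbers k = 2 and l = 5.
Y = conj([[2, 0], [0, 5]])

commutes = all(matmul(d, Y) == matmul(Y, d) for d in D)
is_scalar = Y[0][1] == 0 and Y[1][0] == 0 and Y[0][0] == Y[1][1]
print(D[1], Y, commutes, is_scalar)
```

Here `Y` commutes with both matrices of the representation without being of the form $kI$, as Theorem II asserts.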

We now show that the converse is also true, i.e.

THEOREM III. If there exists a matrix $Y$ which is not a scalar matrix
and which commutes with all the matrices $D_\alpha$ of a linear
representation, the representation is reducible.

We have for any subscript $\alpha$, by the conditions of the theorem:

$$D_\alpha Y = Y D_\alpha. \tag{87}$$

Let $Z$ be a matrix with a non-zero determinant such that all the
matrices $Z D_\alpha Z^{-1}$ are unitary: $Z D_\alpha Z^{-1} = U_\alpha$. We
re-write the above equation as

$$Z^{-1} U_\alpha Z Y = Y Z^{-1} U_\alpha Z.$$

We obtain by multiplying on the left by $Z$ and on the right by $Z^{-1}$:

$$U_\alpha (Z Y Z^{-1}) = (Z Y Z^{-1}) U_\alpha,$$

i.e. $Z Y Z^{-1}$ commutes with all the matrices of the unitary
representation. This is clearly not a scalar matrix since if
$Z Y Z^{-1} = kI$, we have $Y = kI$. It is sufficient for us to prove the
reducibility of the equivalent linear representation $U_\alpha$, and the
proof of the theorem is thus reduced to the case when the representation
is unitary. We simplify the writing by assuming that the linear
representation consisting of matrices $D_\alpha$ is itself unitary.

Let $\lambda_1$ be a characteristic root of matrix $Y$. We know from [25]
that the matrix $\lambda_1 I$ commutes with any matrix; consequently the
matrix $Y - \lambda_1 I$ as well as $Y$ satisfies condition (87), i.e.
commutes with all the $D_\alpha$. It is easily seen that at least one of the
characteristic roots of $Y_1 = Y - \lambda_1 I$ vanishes. For, the
characteristic equation for $Y_1$ will be

$$D(Y_1 - \lambda I) = D[Y - (\lambda + \lambda_1) I] = 0,$$

i.e. it is found from the characteristic equation for $Y$ by replacing
$\lambda$ by $(\lambda + \lambda_1)$, and since one of the characteristic roots
of $Y$ is $\lambda_1$, at least one of the characteristic roots of $Y_1$ is
zero. It follows that the determinant of the matrix $Y_1$, equal to the
product of the characteristic roots, also vanishes. We can thus assume
in the proof of our theorem that all the $D_\alpha$ are unitary and that the
determinant of the matrix $Y$ appearing in (87) is zero.
We consider a set of vectors having the components

$$\begin{aligned} x_1 &= y_{11} u_1 + y_{12} u_2 + \dots + y_{1n} u_n \\ x_2 &= y_{21} u_1 + y_{22} u_2 + \dots + y_{2n} u_n \\ &\dots \\ x_n &= y_{n1} u_1 + y_{n2} u_2 + \dots + y_{nn} u_n, \end{aligned} \tag{88}$$

where the $u_s$ take any values and the $y_{ik}$ are elements of matrix $Y$.
Since the determinant of $Y$ vanishes, the rank of the array $\| y_{ik} \|$
is less than $n$. Let the rank be $r < n$. We know from [16] that in this
case (88) define an r-dimensional subspace $R'$.

We consider the left-hand side of the equation

$$D_\alpha Y u = Y D_\alpha u. \tag{89}$$

The vector $Yu$ has in fact the components (88), and $D_\alpha Y u$ is
therefore the result of applying the transformation $D_\alpha$ to some
arbitrarily chosen vector of the subspace $R'$. We have on the right-hand
side of (89) the result of applying the transformation $Y$ to the vector
$D_\alpha u$, i.e. the components of the right-hand side are given by the
same expressions (88) except that $u_1, \dots, u_n$ are replaced by the
components of $D_\alpha u$, i.e. the right-hand side of (89) represents a
vector belonging to the subspace $R'$. We see from this, on comparing the
two sides, that the application of transformation $D_\alpha$ to any vector
of $R'$ yields a vector also belonging to $R'$. But we know from [65] that,
if unitary transformations leave a subspace unchanged, they form a
reducible representation. The theorem is thus proved.
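
The key step of the proof, namely that the vectors $Yu$ fill a subspace carried into itself by every $D_\alpha$, can be traced on a small example. The sketch is illustrative only: the matrices `D` and `Y` are assumed test data (a matrix with $D^2 = I$ and a non-scalar matrix commuting with it), and `D` is not unitary, which does not matter for this step since the invariance of the subspace uses only relation (89).

```python
def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def matvec(m, v):
    """Product of a matrix and a column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

D = [[1, -2], [0, -1]]     # assumed representation matrix, D*D = I
Y = [[2, 3], [0, 5]]       # assumed non-scalar matrix commuting with D
lam = 2                    # a characteristic root of Y

# Y1 = Y - lam*I still commutes with D and has zero determinant.
Y1 = [[Y[i][j] - (lam if i == j else 0) for j in range(2)] for i in range(2)]
det_Y1 = Y1[0][0] * Y1[1][1] - Y1[0][1] * Y1[1][0]

# The vectors Y1 u fill the line spanned by the non-zero column of Y1.
column = [Y1[0][1], Y1[1][1]]

def in_image(w):
    """True when w is proportional to the spanning column."""
    return w[0] * column[1] - w[1] * column[0] == 0

samples = [[1, 0], [0, 1], [2, -5], [7, 3]]
relation_89 = all(matvec(D, matvec(Y1, u)) == matvec(Y1, matvec(D, u)) for u in samples)
invariant = all(in_image(matvec(D, matvec(Y1, u))) for u in samples)
print(Y1, det_Y1, relation_89, invariant)
```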

Theorems II and III show that the necessary and sufficient condition
for the irreducibility of a linear representation is that there exists no
matrix, not of the form $kI$, commuting with all the matrices appearing
in the representation.

It follows at once from Theorem I that there is no need to mention
the unitariness of the representation in the theorem of [65], and it
can be generally stated that, if all the matrices of a representation leave
some subspace unchanged, the representation is reducible. The converse
is obvious.


67. Abelian groups and representations of the first degree. A group
is described as Abelian if any two of its elements commute, i.e.
$G_{\alpha_1} G_{\alpha_2}$ is equal to $G_{\alpha_2} G_{\alpha_1}$ for any
subscripts [56]. Let $A_{\alpha_1}$, $A_{\alpha_2}$ be the matrices
corresponding to $G_{\alpha_1}$, $G_{\alpha_2}$ in a linear representation. We
have $A_{\alpha_1} A_{\alpha_2}$ corresponding to the product
$G_{\alpha_1} G_{\alpha_2}$, and similarly $A_{\alpha_2} A_{\alpha_1}$ to
$G_{\alpha_2} G_{\alpha_1}$. But the products are the same and we must
therefore have

$$A_{\alpha_1} A_{\alpha_2} = A_{\alpha_2} A_{\alpha_1},$$

i.e. any two of the matrices of a linear representation of an Abelian
group commute.

Suppose the representation is unitary, i.e. all the matrices are
unitary. We know that there now exists a unitary transformation $U$ such
that all the matrices $U A_\alpha U^{-1}$ have a purely diagonal form [42];
thus there is an equivalent linear representation consisting of purely
diagonal matrices

$$U A_\alpha U^{-1} = [k^{(1)}_\alpha, \dots, k^{(n)}_\alpha].$$

It can be seen from this that the linear representation here decomposes
into $n$ first degree representations

$$x'_s = k^{(s)}_\alpha x_s \quad (s = 1, 2, \dots, n).$$

To sum up, every unitary representation of an Abelian group is
equivalent to a set of first degree representations, passage to an
equivalent representation being likewise accomplished with the aid of a
unitary transformation.

We now consider some examples, both of Abelian group representations,
and of first degree linear representations of non-Abelian groups.

Example 1. We take as our first example the cyclic (Abelian) group of order
m, consisting of the elements

$$S^0 = I,\ S,\ S^2,\ \dots,\ S^{m-1} \quad (S^m = I). \tag{90}$$

If the linear transformation $x' = \omega x$, or what amounts to the same thing,
the number $\omega$ corresponds to the element $S$, we shall have the following
numbers corresponding to elements (90):

$$1,\ \omega,\ \omega^2,\ \dots,\ \omega^{m-1}.$$

Since $S^m = I$, we must have $\omega^m = 1$, i.e.

$$\omega = e^{2\pi i k / m},$$

where k is an integer that can clearly be set equal to any one of
0, 1, 2, ..., m - 1.

We consider in detail the case m = 2. Here we have

$$I,\ S \quad\text{and}\quad S^2 = I,$$

i.e. $S = S^{-1}$. With k = 0, the identity transformation $x' = x$ or the number 1
corresponds to both the elements $I$ and $S$; with k = 1, the transformation
$x' = x$ corresponds to $I$, and $x' = -x$ to $S$, or in other words, the number 1
corresponds to $I$, and (-1) to $S$. An important case for physics is that when
the group consists of the identity transformation of three-dimensional space
and the symmetry transformation with respect to the origin:

$$x' = -x; \quad y' = -y; \quad z' = -z \quad (S).$$

We clearly have m = 2. The two representations above may be termed the
identity and alternating representation of symmetry with respect to the origin.
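
The first degree representations of Example 1 can be listed and checked numerically. The following sketch is illustrative; the function names `character` and `is_representation` are hypothetical, and floating-point arithmetic is used, so equalities are tested only to within a small tolerance.

```python
import cmath

def character(m, k):
    """Numbers 1, w, ..., w^(m-1) with w = exp(2*pi*i*k/m), assigned to I, S, ..., S^(m-1)."""
    w = cmath.exp(2j * cmath.pi * k / m)
    return [w ** p for p in range(m)]

def is_representation(values, tol=1e-9):
    """Check that the number for S^p S^q equals the product of the numbers for S^p and S^q."""
    m = len(values)
    return all(abs(values[p] * values[q] - values[(p + q) % m]) < tol
               for p in range(m) for q in range(m))

# All five first degree representations of the cyclic group of order 5.
all_ok = all(is_representation(character(5, k)) for k in range(5))

# The case m = 2: identity (k = 0) and alternating (k = 1) representations.
identity_rep = [round(v.real) for v in character(2, 0)]
alternating_rep = [round(v.real) for v in character(2, 1)]
print(all_ok, identity_rep, alternating_rep)
```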

Example 2. We take the group of rotations about the z axis. The matrices
of the group have the form

$$Z_\varphi = \begin{Vmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{Vmatrix} \tag{91}$$

and also satisfy, as we saw above, the obvious relationships

$$Z_{\varphi_1} Z_{\varphi_2} = Z_{\varphi_2} Z_{\varphi_1} = Z_{\varphi_1 + \varphi_2}.$$

The function $e^{l\varphi}$ also satisfies these relationships. But it must be
noticed that, if $\varphi = 2\pi$, the rotation is equivalent to the identity
transformation, and we must therefore have $e^{2\pi l} = 1$, i.e. the number $l$
must be of the form $l = ki$, where k is an integer. We thus have an infinite
set of linear representations of the rotation group, with the numbers

$$e^{ik\varphi}$$

corresponding to matrices (91).
On assigning all possible values to the number k:

$$k = 0,\ \pm 1,\ \pm 2,\ \dots,$$

we get the infinite set of linear representations of the rotation group.
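
The two requirements on the exponent can be checked with a few lines of arithmetic. This is an illustrative sketch only; the function name `chi` and the sample angles are arbitrary assumptions.

```python
import cmath
import math

def chi(k, phi):
    """The number exp(i*k*phi) assigned to the rotation through the angle phi."""
    return cmath.exp(1j * k * phi)

phi1, phi2 = 0.7, 2.1    # arbitrary sample angles (assumed)

# Products correspond to products: chi(phi1) chi(phi2) = chi(phi1 + phi2).
homomorphic = abs(chi(3, phi1) * chi(3, phi2) - chi(3, phi1 + phi2)) < 1e-12

# For integral k the rotation through 2*pi receives the number 1 ...
single_valued = abs(chi(3, 2 * math.pi) - 1) < 1e-12

# ... whereas a non-integral exponent such as k = 1/2 gives -1 instead.
fails_for_half = abs(chi(0.5, 2 * math.pi) - 1) > 1
print(homomorphic, single_valued, fails_for_half)
```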


Example 3. We now take the group consisting of the n! permutations of n
elements. We can associate the number (+1) with each permutation, in which
case we have what is known as the symmetric representation of the permutation
group. Alternatively, we can associate the number (+1) with even permutations,
consisting of an even number of transpositions, and the number (-1) with odd
permutations. This gives us what is known as the anti-symmetric representation
of the permutation group. In this representation, the number (+1) corresponds
to each permutation of the alternating subgroup, and (-1) to the remaining
permutations. It can be shown, though we shall not dwell on the proof, that
the above two cases exhaust the possibilities as regards first degree linear
representation of the permutation group. The group has other representations
of higher degree than the first.
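
That the anti-symmetric assignment really is a representation can be verified exhaustively for a small n. The sketch below is illustrative; it takes n = 4 and determines the parity of a permutation by counting inversions, which is one standard way of deciding whether the number of transpositions is even or odd (the helper names are hypothetical).

```python
from itertools import permutations

def compose(p, q):
    """Product p*q of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """+1 for an even permutation, -1 for an odd one (parity of the inversion count)."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

group = list(permutations(range(4)))     # the 4! = 24 permutations of four elements

# The number attached to a product is the product of the attached numbers.
antisymmetric_ok = all(
    sign(compose(p, q)) == sign(p) * sign(q) for p in group for q in group
)

# The permutations receiving +1 form the alternating subgroup, half of the group.
even_count = sum(1 for p in group if sign(p) == 1)
print(antisymmetric_ok, even_count)
```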


Example 4. We next take the group of all the real orthogonal transformations
on a plane, i.e. the group consisting of rotations of the plane about the
origin combined with symmetrical reflection in the y axis. We saw above [52]
that the matrices of this group are of the form

$$\{\varphi, d\} = \begin{Vmatrix} d\cos\varphi & -d\sin\varphi \\ \sin\varphi & \cos\varphi \end{Vmatrix}, \tag{92}$$

where d = 1 for a pure rotation and d = -1 for rotation combined with
reflection. Apart from the obvious first degree linear representation in which
each matrix (92) has the corresponding number (+1), we can form a further first
degree representation in which the number (+1) corresponds to matrix (92)
if d = 1, and the number (-1) corresponds to (92) if d = -1. This in fact gives
us a linear representation, since the product of two matrices of form (92)
corresponds to a pure rotation if d has the same sign in both matrices, and to a
rotation with reflection if d has different signs.
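
The number d attached to a matrix of form (92) is simply its determinant, which makes the multiplicative property easy to test. The sketch below is illustrative only; the sample angles are arbitrary, and the helper names `element` and `d_of` are hypothetical.

```python
import math

def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def element(phi, d):
    """Matrix (92) with angle phi and d = +1 (rotation) or d = -1 (rotation with reflection)."""
    return [[d * math.cos(phi), -d * math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]

def d_of(mat):
    """The number d of a matrix of form (92), recovered as its rounded determinant."""
    return round(mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0])

samples = [element(phi, d) for phi in (0.0, 0.4, 1.3, 2.9) for d in (1, -1)]

# The number attached to a product equals the product of the attached numbers.
first_degree_ok = all(
    d_of(matmul(g2, g1)) == d_of(g2) * d_of(g1) for g2 in samples for g1 in samples
)
print(first_degree_ok)
```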


68. Linear representations of the unitary group in two variables.
We consider the linear representations of the unitary group in two
variables. As we know, this group has the form

$$x'_1 = a x_1 + b x_2; \quad x'_2 = -\bar b x_1 + \bar a x_2, \tag{93}$$

where the complex numbers $a$ and $b$ must be subjected to the condition

$$a \bar a + b \bar b = 1. \tag{94}$$

We form the $(m + 1)$ quantities:

$$\xi_0 = x_1^m; \quad \xi_1 = x_1^{m-1} x_2; \quad \dots; \quad \xi_m = x_2^m. \tag{95}$$

If we take $\xi'_k = x_1'^{\,m-k} x_2'^{\,k}$ and substitute for $x'_1$ and $x'_2$
from (93), each $\xi'_k$ is clearly given linearly in terms of the $\xi_k$, and
hence we shall have for every transformation of group (93) a
corresponding linear transformation from variables $\xi_k$ to $\xi'_k$. It is
obvious that products of transformations correspond to products of
transformations, and we thus have a linear representation of group (93)
of order $(m + 1)$.
Though this representation may not be unitary, all we need do to get
a unitary representation is to introduce an additional constant factor
into each of variables (95), i.e. instead of (95), we define our variables
by

$$\eta_k = \frac{x_1^{m-k} x_2^{k}}{\sqrt{(m-k)!\,k!}} \quad (k = 0, 1, \dots, m) \tag{96$_1$}$$

and similarly

$$\eta'_k = \frac{x_1'^{\,m-k} x_2'^{\,k}}{\sqrt{(m-k)!\,k!}}, \tag{96$_2$}$$

where we reckon 0! = 1 as usual.
We show that our representation is unitary with this definition
of the variables, i.e.

$$\sum_{k=0}^{m} \eta'_k \bar\eta'_k = \sum_{k=0}^{m} \eta_k \bar\eta_k. \tag{97}$$

We have, in fact, on applying the binomial formula:

$$\sum_{k=0}^{m} \eta_k \bar\eta_k = \sum_{k=0}^{m} \frac{x_1^{m-k}\, \bar x_1^{m-k}\, x_2^{k}\, \bar x_2^{k}}{(m-k)!\,k!} = \frac{1}{m!} (x_1 \bar x_1 + x_2 \bar x_2)^m$$

and similarly

$$\sum_{k=0}^{m} \eta'_k \bar\eta'_k = \frac{1}{m!} (x'_1 \bar x'_1 + x'_2 \bar x'_2)^m.$$

But since transformation (93) is unitary,

$$x'_1 \bar x'_1 + x'_2 \bar x'_2 = x_1 \bar x_1 + x_2 \bar x_2,$$

and relationship (97) therefore holds.
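
Relationship (97) and the closed expression obtained from the binomial formula can be confirmed numerically. The sketch below is illustrative only: the values of $a$, $b$, $x_1$, $x_2$ and the choice m = 4 are arbitrary assumptions (with $a\bar a + b\bar b = 1$ by construction), and the comparisons are made to within rounding error.

```python
import cmath
import math

def eta(x1, x2, m):
    """Variables eta_k = x1^(m-k) x2^k / sqrt((m-k)! k!), k = 0, ..., m."""
    return [x1 ** (m - k) * x2 ** k / math.sqrt(math.factorial(m - k) * math.factorial(k))
            for k in range(m + 1)]

def norm_sq(v):
    """Sum of the squared moduli of the components."""
    return sum(abs(c) ** 2 for c in v)

# Parameters of a unitary transformation of the group: |a|^2 + |b|^2 = 1.
a = cmath.exp(0.4j) * math.cos(0.3)
b = cmath.exp(-1.1j) * math.sin(0.3)

x1, x2 = 0.5 + 0.2j, -0.3 + 0.7j          # arbitrary point (assumed)
y1 = a * x1 + b * x2                       # transformed variables
y2 = -b.conjugate() * x1 + a.conjugate() * x2

m = 4
lhs = norm_sq(eta(y1, y2, m))              # sum of |eta'_k|^2
rhs = norm_sq(eta(x1, x2, m))              # sum of |eta_k|^2
closed = (abs(x1) ** 2 + abs(x2) ** 2) ** m / math.factorial(m)
print(abs(lhs - rhs) < 1e-12, abs(rhs - closed) < 1e-12)
```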




We now introduce explicit expressions for the coefficients of our
unitary representation of group (93). We make a slight change in
the notation for this purpose and put

$$\eta_l = \frac{x_1^{j+l}\, x_2^{j-l}}{\sqrt{(j+l)!\,(j-l)!}} \quad (l = -j,\ -j+1,\ \dots,\ j-1,\ j). \tag{98}$$

In the previous notation $m = 2j$, and the number $j$ will be an integer
if $m$ is even, and half an odd integer if $m$ is odd. Let $m = 5$;
expressions (98) give us the following six variables:

$$\eta_{-5/2} = \frac{x_2^5}{\sqrt{5!}}; \quad \eta_{-3/2} = \frac{x_1 x_2^4}{\sqrt{1!\,4!}}; \quad \eta_{-1/2} = \frac{x_1^2 x_2^3}{\sqrt{2!\,3!}};$$

$$\eta_{1/2} = \frac{x_1^3 x_2^2}{\sqrt{3!\,2!}}; \quad \eta_{3/2} = \frac{x_1^4 x_2}{\sqrt{4!\,1!}}; \quad \eta_{5/2} = \frac{x_1^5}{\sqrt{5!}}.$$

The variables here are enumerated, not by the first six integers,
but by fractions that differ from each other by unity and run from
$(-5/2)$ to $(+5/2)$. If $m = 4$, we have by (98) the five variables

$$\eta_{-2} = \frac{x_2^4}{\sqrt{4!}}; \quad \eta_{-1} = \frac{x_1 x_2^3}{\sqrt{1!\,3!}}; \quad \eta_0 = \frac{x_1^2 x_2^2}{\sqrt{2!\,2!}}; \quad \eta_1 = \frac{x_1^3 x_2}{\sqrt{3!\,1!}}; \quad \eta_2 = \frac{x_1^4}{\sqrt{4!}}.$$

The variables here are enumerated by the integers from $(-2)$ to
$(+2)$. For every fixed $m = 2j$, we have precisely the same enumeration
of rows and columns in the matrices which form the linear
representation of order $(2j + 1)$ of group (93).

We next turn our attention to finding the elements of these matrices.
We have

$$\eta'_l = \frac{x_1'^{\,j+l}\, x_2'^{\,j-l}}{\sqrt{(j+l)!\,(j-l)!}} = \frac{(a x_1 + b x_2)^{j+l}\, (-\bar b x_1 + \bar a x_2)^{j-l}}{\sqrt{(j+l)!\,(j-l)!}},$$

and we want to write the right-hand side as a linear combination of
the quantities $\eta_s$. Application of the binomial formula gives us

$$\eta'_l = \sum_{k, k'} (-1)^{j-l-k'} \frac{\sqrt{(j+l)!\,(j-l)!}}{k!\,(j+l-k)!\,k'!\,(j-l-k')!}\; a^{j+l-k}\, \bar b^{\,j-l-k'}\, b^{k}\, \bar a^{k'}\; x_1^{2j-k-k'}\, x_2^{k+k'}.$$

If we reckon $p! = \infty$ when $p$ is a negative integer, we can take the
summation with respect to $k$ and $k'$ in the above expression from
$(-\infty)$ to $(+\infty)$, since the superfluous terms will contain an
infinite factor in the denominator and therefore vanish. We replace $k'$
by a new variable of summation $s = j - k - k'$, so that the summation is
over integral or half integral values from $(-\infty)$ to $(+\infty)$,
depending on whether $j$ is an integer or half an integer. We thus get

$$\eta'_l = \sum_{k, s} (-1)^{k+s-l} \frac{\sqrt{(j+l)!\,(j-l)!}}{k!\,(j-k-s)!\,(j+l-k)!\,(k+s-l)!}\; \bar a^{\,j-k-s}\, a^{j+l-k}\, \bar b^{\,k+s-l}\, b^{k}\; x_1^{j+s}\, x_2^{j-s}.$$

But we have by (98):

$$x_1^{j+s}\, x_2^{j-s} = \sqrt{(j+s)!\,(j-s)!}\; \eta_s,$$

and we finally get the required linear relationship in the form

$$\eta'_l = \sum_{k, s} (-1)^{k+s-l} \frac{\sqrt{(j+l)!\,(j-l)!\,(j+s)!\,(j-s)!}}{k!\,(j-k-s)!\,(j+l-k)!\,(k+s-l)!}\; \bar a^{\,j-k-s}\, a^{j+l-k}\, \bar b^{\,k+s-l}\, b^{k}\; \eta_s.$$


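Thus, having assigned a fixed $j$, the elements of the matrix of the
linear transformation of order $(2j + 1)$, corresponding to unitary
transformation (93) with matrix

$$\begin{Vmatrix} a & b \\ -\bar b & \bar a \end{Vmatrix},$$

are

$$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}_{ls} = \sum_{k} (-1)^{k+s-l} \frac{\sqrt{(j+l)!\,(j-l)!\,(j+s)!\,(j-s)!}}{k!\,(j-k-s)!\,(j+l-k)!\,(k+s-l)!}\; \bar a^{\,j-k-s}\, a^{j+l-k}\, \bar b^{\,k+s-l}\, b^{k}. \tag{99}$$

The subscripts $l$ and $s$ here run over the following values:

$$l \text{ and } s = -j,\ -j+1,\ \dots,\ j-1,\ j,$$

where the further reminder must be given that, if $j$ is half an integer,
we have an enumeration of rows and columns of the matrix likewise
in half integers. On taking $p! = \infty$ if $p$ is a negative integer, we get
the following limits of summation with respect to $k$ in (99):

$$k \ge 0; \quad k \ge l - s; \quad k \le j - s; \quad k \le j + l. \tag{100}$$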
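
Formula (99) with limits (100) translates directly into a short program. The sketch below is illustrative and not from the original text: it assumes that (93) reads $x'_1 = a x_1 + b x_2$, $x'_2 = -\bar b x_1 + \bar a x_2$, indexes rows and columns by the integers $p = j + l$ and $q = j + s$ so that half-integral $j$ needs no special treatment, and uses hypothetical helper names. It then checks unitarity and the correspondence of products to products for a few orders.

```python
import cmath
import math

def rep_matrix(two_j, a, b):
    """Matrix with elements (99) for j = two_j / 2; row p = j + l, column q = j + s."""
    fact = math.factorial
    ac, bc = a.conjugate(), b.conjugate()
    n = two_j + 1
    D = [[0j] * n for _ in range(n)]
    for p in range(n):
        for q in range(n):
            root = math.sqrt(fact(p) * fact(two_j - p) * fact(q) * fact(two_j - q))
            total = 0j
            # Limits (100): k >= 0, k >= l - s, k <= j - s, k <= j + l.
            for k in range(max(0, p - q), min(two_j - q, p) + 1):
                denom = fact(k) * fact(two_j - q - k) * fact(p - k) * fact(k + q - p)
                total += ((-1) ** (k + q - p) * root / denom
                          * ac ** (two_j - q - k) * a ** (p - k) * bc ** (k + q - p) * b ** k)
            D[p][q] = total
    return D

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def dagger(m):
    """Conjugate transpose."""
    return [[m[c][r].conjugate() for c in range(len(m))] for r in range(len(m))]

def close(x, y, tol=1e-10):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(len(x)) for j in range(len(x)))

def unit(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

# Two unitary transformations of the group (arbitrary sample parameters).
a1, b1 = cmath.exp(0.4j) * math.cos(0.3), cmath.exp(-1.1j) * math.sin(0.3)
a2, b2 = cmath.exp(-0.9j) * math.cos(1.2), cmath.exp(0.5j) * math.sin(1.2)
# Parameters of the product of the second transformation with the first.
a3 = a2 * a1 - b2 * b1.conjugate()
b3 = a2 * b1 + b2 * a1.conjugate()

unitary_ok = all(
    close(matmul(rep_matrix(t, a1, b1), dagger(rep_matrix(t, a1, b1))), unit(t + 1))
    for t in (1, 2, 3)
)
product_ok = all(
    close(rep_matrix(t, a3, b3), matmul(rep_matrix(t, a2, b2), rep_matrix(t, a1, b1)))
    for t in (1, 2, 3)
)
print(unitary_ok, product_ok)
```

With `two_j = 1` the function returns the second-order matrix of the transformation itself, written in the order $\eta_{-1/2} = x_2$, $\eta_{1/2} = x_1$.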


We notice that (99) may be simplified by passing to a similar
representation. Let $A$ be a matrix with elements $a_{pq}$, and $S$ the
diagonal matrix $[\delta_1, \dots, \delta_n]$.

It is easily seen by using the ordinary multiplication rule that the
matrix $S A S^{-1}$ has the elements

$$\{S A S^{-1}\}_{pq} = \delta_p\, a_{pq}\, \delta_q^{-1}.$$

If we now apply this rule to matrices

$$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$$

and take $\delta_l = (-1)^l$, the factor $(-1)^{s-l}$ goes out in (99); in
fact, we shall assume below that this factor is absent.

We next turn to the proof that the linear representation of unitary
group (93) defined by matrices with elements (99) is irreducible. We
prove two preliminary lemmas.

LEMMA I. If a diagonal matrix, of which no two of the diagonal elements
are the same, commutes with a matrix $A$, $A$ is also diagonal.

We have by hypothesis:

$$A[\delta_1, \dots, \delta_n] = [\delta_1, \dots, \delta_n] A,$$

where no two of the $\delta_k$ are the same. Let $a_{pq}$ be the elements of $A$.
Using the multiplication rule, we get from the above:

$$a_{pq} \delta_q = \delta_p a_{pq} \quad\text{or}\quad a_{pq}(\delta_q - \delta_p) = 0,$$

and consequently $a_{pq} = 0$ if $p \ne q$, i.e. $A$ is in fact a diagonal matrix.
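
For orientation, the second-order case of Lemma I can be written out in full; the display below is an added worked illustration, obtained directly from the multiplication rule. With $A = \| a_{pq} \|$ and the diagonal matrix $[\delta_1, \delta_2]$,

```latex
\[
A[\delta_1,\delta_2]-[\delta_1,\delta_2]A
=\begin{Vmatrix} a_{11}\delta_1 & a_{12}\delta_2\\ a_{21}\delta_1 & a_{22}\delta_2\end{Vmatrix}
-\begin{Vmatrix} \delta_1 a_{11} & \delta_1 a_{12}\\ \delta_2 a_{21} & \delta_2 a_{22}\end{Vmatrix}
=\begin{Vmatrix} 0 & a_{12}(\delta_2-\delta_1)\\ a_{21}(\delta_1-\delta_2) & 0\end{Vmatrix}.
\]
```

The difference vanishes with $\delta_1 \ne \delta_2$ only if $a_{12} = a_{21} = 0$, which is the assertion of the lemma for $n = 2$.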

LEMMA II. If a diagonal matrix $[\delta_1, \dots, \delta_n]$ commutes with a
matrix $A$ in which at least one column contains no zeros, we have
$\delta_1 = \dots = \delta_n$.

On interchanging rows and columns, i.e. passing to similar matrices,
we can bring the column with no zeros into the first place. With this,
the diagonal matrix still remains diagonal, and the matrices commute
as before. We can thus suppose, writing $a_{pq}$ for the elements of $A$, that

$$a_{i1} \ne 0 \quad (i = 1, 2, \dots, n),$$

and moreover, by hypothesis, as above:

$$a_{i1}(\delta_1 - \delta_i) = 0 \quad (i = 1, 2, \dots, n),$$

whence we have $\delta_1 = \dots = \delta_n$, and the lemma is proved.


We now prove the irreducibility of the linear representation defined
by matrices (99). Let $Y$ be a matrix of order $(2j + 1)$ which commutes
with all the matrices

$$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$$

obtained with differing $a$ and $b$ satisfying condition (94). To prove
irreducibility, we want to show that $Y$ must be a scalar matrix.
We first take the case when $b = 0$ and $a = e^{i\alpha}$ with real $\alpha$.
These complex numbers clearly satisfy (94).
Using (99), we first of all find that now

$$D_j \begin{Bmatrix} e^{i\alpha} & 0 \\ 0 & e^{-i\alpha} \end{Bmatrix}_{ls} = 0 \quad\text{for}\quad l \ne s,$$

whilst the diagonal elements are

$$D_j \begin{Bmatrix} e^{i\alpha} & 0 \\ 0 & e^{-i\alpha} \end{Bmatrix}_{ll} = e^{2il\alpha} \quad (l = -j,\ -j+1,\ \dots,\ j-1,\ j),$$

and the matrix has the form

$$D_j \begin{Bmatrix} e^{i\alpha} & 0 \\ 0 & e^{-i\alpha} \end{Bmatrix} = \begin{Vmatrix} e^{-2ij\alpha} & 0 & \dots & 0 \\ 0 & e^{-2i(j-1)\alpha} & \dots & 0 \\ \dots & \dots & \dots & \dots \\ 0 & 0 & \dots & e^{2ij\alpha} \end{Vmatrix}, \tag{101}$$

i.e. given a suitable choice of $\alpha$, we have a diagonal matrix with
different elements on the principal diagonal. We can say, by using the
first lemma, that the matrix $Y$ that has to commute with matrices (101)
must also be diagonal, i.e.

$$Y = [\delta_1, \dots, \delta_{2j+1}]. \tag{102}$$
We now take the case when both numbers $a$ and $b$ differ from zero,
and consider the first column of the matrix
$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$. Its elements
are given by (99) on setting $s = -j$. Inequalities (100) now give us

$$k \ge 0; \quad k \ge l + j; \quad k \le 2j; \quad k \le j + l,$$

whence it is clear that the entire sum appearing in (99) reduces to a
single term, which is obtained with $k = j + l$ and evidently differs from
zero. Thus the first column of the matrix
$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$ does not in fact
contain zero. But since the diagonal matrix (102) must commute with
this matrix, all the $\delta_k$ must be the same by Lemma II, i.e. $Y$ is a
scalar matrix. The matrices
$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$ therefore in fact
yield an irreducible linear representation of unitary group (93). On
assigning to $j$ the series of values

$$j = 0,\ \tfrac{1}{2},\ 1,\ \tfrac{3}{2},\ 2,\ \dots$$

we get an infinite set of these linear representations. With $j = 0$ we
get the trivial identity representation, when the number unity
corresponds to every element of group (93). We now consider, with $j > 0$,
to what transformation of group (93) there corresponds the identity
transformation of the representation group
$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$, which is
defined by the equations $\eta'_l = \eta_l$ or, what amounts to the same
thing, by the equations

$$(a x_1 + b x_2)^{j+l}\, (-\bar b x_1 + \bar a x_2)^{j-l} = x_1^{j+l}\, x_2^{j-l} \quad (l = -j,\ -j+1,\ \dots,\ j-1,\ j).$$

Setting $l = j$, we get

$$(a x_1 + b x_2)^{2j} = x_1^{2j},$$

whence it follows that $b = 0$, and the previous equations may be
written as

$$a^{j+l}\, x_1^{j+l}\, \bar a^{\,j-l}\, x_2^{j-l} = x_1^{j+l}\, x_2^{j-l} \quad (l = -j,\ -j+1,\ \dots,\ j-1,\ j),$$

whence $a^{j+l}\, \bar a^{\,j-l} = 1$. But $|a| = 1$ for $b = 0$, and the last
equation may be rewritten

$$a^{2l} = 1 \quad (l = -j,\ -j+1,\ \dots,\ j-1,\ j).$$

If $j$ is half an odd integer, we can put $l = 1/2$ which gives $a = 1$. If
$j$ is an integer, the equations $a^{2l} = 1$ reduce simply to $a^2 = 1$,
whence $a = \pm 1$.
Hence, if $j$ is half an odd integer, the identity transformation in the
group $D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$ merely
corresponds to the identity transformation in group (93), i.e. in this
case $D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$ is a
one-to-one representation of group (93). Whereas if $j$ is an integer, to
the identity transformation in the
$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$ group there
correspond two transformations in group (93) with matrices

$$E = \begin{Vmatrix} 1 & 0 \\ 0 & 1 \end{Vmatrix}; \quad S = \begin{Vmatrix} -1 & 0 \\ 0 & -1 \end{Vmatrix} = -E.$$

These transformations form a second order cyclic group, and the
$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$ are a one-to-one
representation of the complementary group [58]. We can say alternatively
that, to every transformation in the
$D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$ representation
with integral $j$ there correspond two transformations of group (93), for
which the numbers $a$ and $b$ differ only in sign.


69. Linear representations of the rotation group. The above results
are of particular importance since the unitary group (93) is closely
bound up with the group of rotations of three-dimensional space;
in fact, we can use these results to obtain an irreducible linear
representation of the rotation group.

We have a definite rotation corresponding to every unitary
transformation (93), whilst a simultaneous change of sign of $a$ and $b$
gives a unitary transformation to which the same rotation corresponds.
The parameters $a$ and $b$ are related to the Eulerian angles of the
corresponding rotation by the expressions [63]:

$$a = e^{-\frac{i}{2}(\alpha + \gamma)} \cos \tfrac{1}{2}\beta; \quad b = -i e^{\frac{i}{2}(\gamma - \alpha)} \sin \tfrac{1}{2}\beta. \tag{103}$$

We first take the case when $j$ is an integer. Expressions (99) show us
here that a simultaneous change of sign of $a$ and $b$ does not alter the
terms on the right-hand side, since the sum of the exponents of $a$,
$\bar a$, $b$ and $\bar b$ is equal to the even number $2j$. Thus the same
matrix in the linear representation corresponds here to the two unitary
transformations that yield the same rotation. In other words, to each
rotation with Eulerian angles $\{\alpha, \beta, \gamma\}$ there corresponds,
with integral $j$, a definite matrix in the linear representation $D_j$.
Instead of $D_j \begin{Bmatrix} a & b \\ -\bar b & \bar a \end{Bmatrix}$ we
shall now write for this matrix

$$D_j\{\alpha, \beta, \gamma\}. \tag{104}$$


If j is half an odd integer, a simultaneous change of sign of a and b leads to a change of sign of all expressions (99), i.e. we have here, corresponding to the two unitary transformations that lead to the same rotation, two different matrices whose elements are all of opposite sign. Two matrices of opposite sign thus correspond to each rotation, i.e. we now have to put the signs $\pm$ in front of the $D_j$ in (104). To sum up, matrices (104) give us a linear representation of the rotation group when j is an integer. When j is half an odd integer, we do not strictly speaking obtain a linear representation, but have what is termed a two-valued linear representation.
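The two-to-one correspondence between unitary matrices and rotations can be checked numerically. The sketch below is only an illustration: the map from a unitary matrix $U$ to a rotation, $R_{ik} = \tfrac12 \operatorname{tr}(\sigma_i U \sigma_k U^{\dagger})$ with the Pauli matrices $\sigma_i$, is an assumed standard convention and need not coincide with the one used in [63]; the helper names are hypothetical.

```python
import cmath
import math

def su2(a, b):
    """Matrix {a, b; -conj(b), conj(a)}; unitary with unit determinant if |a|^2 + |b|^2 = 1."""
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def dagger(p):
    return [[p[j][i].conjugate() for j in range(len(p))] for i in range(len(p[0]))]

# Pauli matrices (assumed convention, not taken from the text).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def rotation_from_su2(u):
    """R[i][k] = (1/2) tr(sigma_i U sigma_k U^dagger); quadratic in U, so U and -U agree."""
    ud = dagger(u)
    rot = []
    for si in SIGMA:
        row = []
        for sk in SIGMA:
            m = matmul(matmul(si, u), matmul(sk, ud))
            row.append(0.5 * (m[0][0] + m[1][1]).real)
        rot.append(row)
    return rot

# Arbitrary parameters with |a|^2 + |b|^2 = 1.
a = cmath.exp(0.3j) * math.cos(0.4)
b = cmath.exp(-1.1j) * math.sin(0.4)
r_plus = rotation_from_su2(su2(a, b))
r_minus = rotation_from_su2(su2(-a, -b))
```

Since every element of the rotation matrix is quadratic in a and b, `r_plus` and `r_minus` coincide, which is the reason representations with half-odd j become two-valued on the rotation group.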

To obtain expressions for the elements of matrices (104), all we 
need do is substitute from (103) for a and b in (99). We obtain, on 
first neglecting the factor (—1)*’: 


Dia G+ G— DG +H G— a 


Ae Bs rhs =" S(— Wage aigsic meds —m * 


ae oe 1 : 1 - 
x eile —isr Egg? seer B sin2*+s—! a B. (105) 


If we take advantage of passage to the equivalent representation 
with the aid of the matrix 


it becomes a question of interchanging rows and columns in the
reverse order, i.e. of replacing the $l$ and $s$ by $(-l)$ and $(-s)$. We can
thus write, instead of (105), the new expressions


, sp VG LOL —OEG + 8)! G — 8)! 
Dia. B vhs = 8" S(— I pGoet erga eae % 


5 a: 1 2 ok sa 
X elle—isy eogtf—I-2kts 2B ginth—s — B, (106) 


the factor i'* being neglected by using the same arguments as in [68]. 
We note some elementary particular cases. With j = 0, we have 
the first degree linear representation 


y=. 

This is the trivial identity representation. With j = 1/2, we have 
27 + 1 = 2, and the quantities 7_, and 7, are simply equal to 2, 
and z,, i.e. unitary group (93) is here its own special linear represen- 
tation (apart from possible interchange of rows and columns). 




We obtain for the rotation group a two-valued representation of 
the second degree, defined by the matrices 


> ives) + (y—-2) ! 
| cos 5 8, ie? sind | 
D, {a, B, 7} = | 1. 1, 
z Pas es ke: Sa | = t—s) 1 
i ie: sin = B,e cos > B 


With j = 1, we have a linear representation of the third degree:


| e-tortay 1+ cos . eee sme. go Joe special | 
a! 2 
| ; a 
, | ely sin B iy 
—lie : cos B, —e 
Di{a, B, sl y2 B 

| eH) l1—cos f eit sin B ally) 1+ cos 8 

| y2 ° 2 


The linear representations $D_j\{\alpha, \beta, \gamma\}$ with integral j give one-to-one representations of the rotation group. This follows at once from the fact that two matrices of group (93) correspond to each $D_j\{\alpha, \beta, \gamma\}$, where the matrices differ only in the signs of a and b and, as mentioned above, correspond to the same rotation. If j is half an odd integer, to each rotation there correspond two matrices of the $D_j\{\alpha, \beta, \gamma\}$ representation, differing only in sign. In particular, the matrices $\pm E$ of the $D_j\{\alpha, \beta, \gamma\}$ correspond to the identity transformation of the rotation group, where E is the unit matrix of order (2j + 1). If we confine ourselves to transformations of $D_j\{\alpha, \beta, \gamma\}$ sufficiently close to the identity transformation, the $D_j\{\alpha, \beta, \gamma\}$ are single-valued representations of the rotation group. In this case, it is sufficient to confine ourselves to values of $\alpha$, $\beta$, $\gamma$ near enough to zero in the general expressions (106). But if we add $2\pi$ to $\alpha$ or $\gamma$, all the matrices of $D_j\{\alpha, \beta, \gamma\}$ change sign in view of the fact that $s$ and $l$ are halves of odd numbers, and we get a second representation of essentially the same rotation. We show later that the representations mentioned are all the isomorphic irreducible representations of the rotation group.

Since the $D_j\{\alpha, \beta, \gamma\}$ are all the irreducible representations of the rotation group, the third degree matrix $D_1\{\alpha, \beta, \gamma\}$ must be similar to the matrix $D\{\alpha, \beta, \gamma\}$ corresponding to the rotation of space with Eulerian angles $\{\alpha, \beta, \gamma\}$. We saw in [63] that $D = Z_\alpha X_\beta Z_\gamma$, and on carrying out the matrix multiplication, we get

$$D = \begin{Vmatrix}
\cos\alpha\cos\gamma - \sin\alpha\cos\beta\sin\gamma, & -\cos\alpha\sin\gamma - \sin\alpha\cos\beta\cos\gamma, & \sin\alpha\sin\beta \\
\sin\alpha\cos\gamma + \cos\alpha\cos\beta\sin\gamma, & -\sin\alpha\sin\gamma + \cos\alpha\cos\beta\cos\gamma, & -\cos\alpha\sin\beta \\
\sin\beta\sin\gamma, & \sin\beta\cos\gamma, & \cos\beta
\end{Vmatrix},$$


and it may easily be verified that 


AD; {a, B, y} A-1 = Dia, 8, y}, 


1, 0, 1 
A=||t, 0, —7 
0, Y2i OO 
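The closed form of $D = Z_\alpha X_\beta Z_\gamma$ can be checked by multiplying the three factors numerically. The sketch below assumes that $Z_t$ and $X_t$ are counter-clockwise rotations by the angle t about the z and x axes (the definitions belong to [63] and are not reproduced in this section); under that assumption the element in the second row and third column of the product is $-\cos\alpha\sin\beta$. Function names are illustrative only.

```python
import math

def z_rot(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def x_rot(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_product(alpha, beta, gamma):
    """D = Z_alpha X_beta Z_gamma, formed by explicit multiplication."""
    return matmul(matmul(z_rot(alpha), x_rot(beta)), z_rot(gamma))

def euler_closed_form(alpha, beta, gamma):
    """The same matrix written out element by element."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cg - sa * cb * sg, -ca * sg - sa * cb * cg, sa * sb],
        [sa * cg + ca * cb * sg, -sa * sg + ca * cb * cg, -ca * sb],
        [sb * sg, sb * cg, cb],
    ]

d_prod = euler_product(0.4, 1.1, -0.7)
d_form = euler_closed_form(0.4, 1.1, -0.7)
max_diff = max(abs(d_prod[i][j] - d_form[i][j]) for i in range(3) for j in range(3))
```

A quick orthogonality check of the rows of `d_form` is a convenient way to catch sign or sine/cosine slips in any single element.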
70. The theorem on the simplicity of the rotation group. We show next that the rotation group is simple, i.e. it has no normal subgroups other than itself and the identity subgroup [58]. If there were such a subgroup, it would follow from what was said in [63] that there exists a corresponding normal subgroup of the group G of transformations (57) with unity determinant, differing from the normal subgroup H consisting of $E$ and $(-E)$. We therefore want to show that group G has no normal subgroup other than H, i.e. that if a normal subgroup $H_1$ of G contains a matrix A different from $E$ and $(-E)$, $H_1$ coincides with G. We observe first of all that, if $H_1$ contains a matrix B, it follows by the definition of normal divisor that $H_1$ contains all the matrices $U^{-1}BU$, where U is any matrix of group G. We can thus obtain by suitable choice of U any matrix of G having the same characteristic roots as B. Hence to show that $H_1$ is the same as G, we only need to show that $H_1$ contains a matrix with any permissible characteristic roots. These roots must have the form $e^{i\omega}$ and $e^{-i\omega}$, where $\omega$ is a real number, since the matrix is unitary and its determinant is equal to unity.
By what has been said, we can take $U^{-1}AU$ instead of the matrix A and it can therefore be assumed that A is diagonal.
Let it be given then that $H_1$ contains $A = [e^{i\varphi}, e^{-i\varphi}]$, where $\varphi$ is real, and $e^{i\varphi} \neq \pm 1$. With this, $A^{-1} = [e^{-i\varphi}, e^{i\varphi}]$. We take the arbitrary matrix of group G:

$$U = \begin{Vmatrix} x, & y \\ -\bar y, & \bar x \end{Vmatrix} \qquad (x\bar x + y\bar y = 1).$$

Now,

$$U^{-1} = \begin{Vmatrix} \bar x, & -y \\ \bar y, & x \end{Vmatrix}.$$



Since the subgroup $H_1$ contains A and is a normal divisor, it must also contain the matrix

$$Y = A(UA^{-1}U^{-1}).$$

On carrying out the matrix multiplication and taking into account that $x\bar x + y\bar y = 1$, we get the following expression for the trace s of Y:

$$s = 2 - 4y\bar y \sin^2\varphi = 2 - 4\rho^2 \sin^2\varphi,$$

where $\rho = |y|$ can take any value from the interval $0 \le \rho \le 1$, and $\sin\varphi \neq 0$. The characteristic roots of Y, $(e^{i\alpha}, e^{-i\alpha})$, are roots of the equation

$$\lambda^2 - s\lambda + 1 = 0, \quad \text{i.e.} \quad \lambda^2 + (4\rho^2\sin^2\varphi - 2)\lambda + 1 = 0.$$

As $\rho$ varies from $\rho = 0$ to $\rho = 1$, $\alpha$ runs from $\alpha = 0$ to $\alpha = 2\varphi$. We introduce the following notation:

$$U_\alpha = [e^{i\alpha}, e^{-i\alpha}].$$

It follows from the above remarks that $H_1$ contains all the matrices $U_\alpha$ with $0 \le \alpha \le 2\varphi$. It is now easy to show that $H_1$ contains any matrix $U_\beta$ $(\beta > 0)$. For on choosing a positive integer n such that $0 < \beta/n \le 2\varphi$, $H_1$ will contain $U_{\beta/n}$, and will therefore also contain

$$U_{\beta/n}^{\,n} = U_\beta.$$

Hence $H_1$ contains matrices with any characteristic roots and by what we have said above, must coincide with G. We have thus shown that the rotation group is simple.
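The trace formula at the heart of this proof is easy to confirm numerically. The following sketch picks arbitrary values for $\varphi$, $x$ and $y$ (with $|x|^2 + |y|^2 = 1$), forms $Y = A U A^{-1} U^{-1}$ by explicit 2×2 multiplication, and compares the trace with $2 - 4\rho^2\sin^2\varphi$; the variable names are illustrative only.

```python
import cmath
import math

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace_of_y(phi, x, y):
    """Trace of Y = A U A^{-1} U^{-1} for A = diag(e^{i phi}, e^{-i phi})."""
    a = [[cmath.exp(1j * phi), 0], [0, cmath.exp(-1j * phi)]]
    a_inv = [[cmath.exp(-1j * phi), 0], [0, cmath.exp(1j * phi)]]
    u = [[x, y], [-y.conjugate(), x.conjugate()]]
    u_inv = [[x.conjugate(), -y], [y.conjugate(), x]]
    y_mat = matmul(a, matmul(matmul(u, a_inv), u_inv))
    return y_mat[0][0] + y_mat[1][1]

phi = 0.7
rho = 0.6                                     # rho = |y|
x = cmath.exp(0.2j) * math.sqrt(1 - rho ** 2)
y = cmath.exp(-1.3j) * rho

s = trace_of_y(phi, x, y)
expected = 2 - 4 * rho ** 2 * math.sin(phi) ** 2
alpha = math.acos(s.real / 2)                 # characteristic roots are e^{+-i alpha}
```

With these values `alpha` lies between 0 and $2\varphi$, in agreement with the statement that $\alpha$ sweeps that interval as $\rho$ runs from 0 to 1.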

It follows at once from this that the rotation group cannot have
homomorphic (as distinct from isomorphic) representations. For if
there were such a representation, to the identity transformation in the 
representation group there would have to correspond in the rotation 
group transformations forming a normal subgroup, whereas no normal 
subgroup exists by the above. 


71. Laplace's equation and linear representations of the rotation group. We shall next indicate the connection between linear representations of groups and differential equations. This connection lies at the basis of the application of linear representations to problems of modern physics. We shall start with the elementary case of Laplace's equation [II, 92]: whilst giving us nothing new, this will throw some light on the subject as a whole. We first establish some general facts that play an important part in problems of linear representations of groups; these generalizations are indeed already familiar to us in certain particular cases from the above examples.

Let G, whose linear representations we wish to form, be a group of linear transformations of order n:

$$x'_k = g^{(\alpha)}_{k1} x_1 + \ldots + g^{(\alpha)}_{kn} x_n \qquad (k = 1, 2, \ldots, n), \tag{107}$$

where the superscript $\alpha$, characterizing an element $G_\alpha$ of G, runs over a finite or infinite set of values. Further, let m functions exist:

$$\varphi_s(x_1, \ldots, x_n) \qquad (s = 1, 2, \ldots, m) \tag{108}$$

such that they also undergo a linear transformation on substituting for the independent variables in accordance with (107):

$$\varphi_s(x'_1, \ldots, x'_n) = a^{(\alpha)}_{s1}\varphi_1(x_1, \ldots, x_n) + \ldots + a^{(\alpha)}_{sm}\varphi_m(x_1, \ldots, x_n) \qquad (s = 1, 2, \ldots, m). \tag{109}$$

We have here a matrix $A_\alpha$ with elements $a^{(\alpha)}_{ik}$ corresponding to transformation (107) of group G. We consider two transformations of the group:

$$(x'_1, \ldots, x'_n) = G_{\alpha_1}(x_1, \ldots, x_n); \qquad (x''_1, \ldots, x''_n) = G_{\alpha_2}(x'_1, \ldots, x'_n); \qquad G_{\alpha_3} = G_{\alpha_2} G_{\alpha_1}.$$

The corresponding transformations of functions (108) are

$$\varphi_s(x'_1, \ldots, x'_n) = a^{(\alpha_1)}_{s1}\varphi_1(x_1, \ldots, x_n) + \ldots + a^{(\alpha_1)}_{sm}\varphi_m(x_1, \ldots, x_n) \tag{110$_1$}$$

and

$$\varphi_s(x''_1, \ldots, x''_n) = a^{(\alpha_2)}_{s1}\varphi_1(x'_1, \ldots, x'_n) + \ldots + a^{(\alpha_2)}_{sm}\varphi_m(x'_1, \ldots, x'_n). \tag{110$_2$}$$

On substituting for the $\varphi_s(x'_1, \ldots, x'_n)$ from (110$_1$) in (110$_2$), we get a direct relationship for $\varphi_s(x''_1, \ldots, x''_n)$ in terms of $\varphi_s(x_1, \ldots, x_n)$, yielding a matrix $A_{\alpha_3}$. We thus obtain

$$\{A_{\alpha_3}\}_{ik} = \sum_{s=1}^{m} a^{(\alpha_2)}_{is} a^{(\alpha_1)}_{sk}, \quad \text{i.e.} \quad A_{\alpha_3} = A_{\alpha_2} A_{\alpha_1},$$

and expressions (109) evidently define a linear representation of order m of group G. We have assumed in the above arguments that the functions $\varphi_s$ are linearly independent. In this case linear transformations (109) are uniquely defined and $D(A_\alpha) \neq 0$, since otherwise the $\varphi_s(x'_1, \ldots, x'_n)$ would be connected by a linear relationship.




In the particular case of constructing linear representations of a unitary group, the role of the functions $\varphi_s$ was played by functions (96$_1$).

Let G be the group of rotations of three-dimensional space, so that n = 3, and let the $\varphi_s$ be orthogonal and normalized in a sphere K with centre at the origin, i.e.

$$\iiint_K \varphi_p(x_1, x_2, x_3)\,\overline{\varphi_q(x_1, x_2, x_3)}\,dx_1\,dx_2\,dx_3 = \delta_{pq}. \tag{111}$$

We show that linear representation (109) of the rotation group is now unitary. The sphere K is displaced into itself as a result of a rotation $G_\alpha$, and the determinant of $G_\alpha$ is known to be equal to unity. Condition (111) thus gives us

$$\iiint_K \varphi_p(x'_1, x'_2, x'_3)\,\overline{\varphi_q(x'_1, x'_2, x'_3)}\,dx'_1\,dx'_2\,dx'_3 = \delta_{pq},$$

or by (109):

$$\iiint_K \Big[\sum_{t=1}^{m} a^{(\alpha)}_{pt}\varphi_t(x_1, x_2, x_3)\Big]\Big[\sum_{j=1}^{m} \overline{a^{(\alpha)}_{qj}}\;\overline{\varphi_j(x_1, x_2, x_3)}\Big]\,dx'_1\,dx'_2\,dx'_3 = \delta_{pq}.$$

By the rule for change of variables in a triple integral, on passing to $(x_1, x_2, x_3)$ we have simply to substitute $dx_1\,dx_2\,dx_3$ for $dx'_1\,dx'_2\,dx'_3$ and then integrate over the same sphere K. We get by (111):

$$\sum_{t=1}^{m} a^{(\alpha)}_{pt}\,\overline{a^{(\alpha)}_{qt}} = \delta_{pq} \qquad (p, q = 1, 2, \ldots, m),$$

where, as usual, $\delta_{pq} = 0$ for $p \neq q$ and $\delta_{pp} = 1$, i.e. each of the matrices $A_\alpha$ is here orthogonal by rows, whilst the transposed matrices will be orthogonal by columns and consequently by rows also [28]; it follows that each matrix $A_\alpha$ has orthogonality both of rows and columns, or in other words, the $A_\alpha$ are in fact unitary matrices for any $\alpha$.
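We now consider the Laplace equation in two variables

$$\frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2} = 0 \tag{112}$$

or, in vector notation,

$$\operatorname{div} \operatorname{grad} U = 0. \tag{113}$$

We take the homogeneous polynomial in x and y of degree l:

$$\varphi_l(x, y) = a_0 x^l + a_1 x^{l-1} y + \ldots + a_k x^{l-k} y^k + \ldots + a_l y^l. \tag{114}$$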


We show that there exist two linearly independent polynomials of type (114) that are solutions of equation (112), and that every solution of (112) that is a homogeneous polynomial of degree l must be a linear combination of the above two polynomials with constant coefficients. In fact, the coefficients of polynomial (114) are given by

$$a_k = \frac{1}{k!\,(l-k)!}\,\frac{\partial^l \varphi_l(x, y)}{\partial x^{l-k}\,\partial y^k}.$$

But since this polynomial must satisfy equation (112), we can replace double differentiation with respect to y by double differentiation with respect to x whilst at the same time changing the sign, inasmuch as (112) can be written as

$$\frac{\partial^2 U}{\partial y^2} = -\frac{\partial^2 U}{\partial x^2}.$$

We thus obtain for the coefficients $a_k$ expressions of the form

$$a_k = \pm\frac{1}{k!\,(l-k)!}\,\frac{\partial^l \varphi_l}{\partial x^l} \quad \text{or} \quad a_k = \pm\frac{1}{k!\,(l-k)!}\,\frac{\partial^l \varphi_l}{\partial x^{l-1}\,\partial y},$$

according as k is even or odd, i.e. all the coefficients of polynomial (114) are expressible in terms of the coefficients $a_0$ and $a_1$. This argument shows us that there exist no more than two linearly independent homogeneous polynomials of degree l satisfying equation (112). We now show that two such distinct polynomials in fact exist. For this, we consider the homogeneous polynomial

$$\omega_l(x, y) = (x + iy)^l.$$

After removing the brackets and separating real and imaginary parts, we get

$$\omega_l(x, y) = \varphi_l(x, y) + i\psi_l(x, y),$$

where $\varphi_l(x, y)$ and $\psi_l(x, y)$ are real linearly independent homogeneous polynomials of degree l. We get by differentiating $\omega_l(x, y)$:

$$\frac{\partial^2 \omega_l(x, y)}{\partial x^2} = l(l-1)(x + iy)^{l-2}; \qquad \frac{\partial^2 \omega_l(x, y)}{\partial y^2} = -l(l-1)(x + iy)^{l-2},$$

i.e. $\omega_l(x, y)$ satisfies equation (112). The same can therefore be said of the real and imaginary parts of this function, i.e. of polynomials $\varphi_l(x, y)$ and $\psi_l(x, y)$, and these give us the two required solutions of (112). We introduce polar coordinates

$$x = r\cos\varphi; \qquad y = r\sin\varphi,$$

whence

$$\omega_l(x, y) = r^l e^{il\varphi}.$$

Polynomials $\varphi_l$ and $\psi_l$ now take the very simple forms

$$\varphi_l(x, y) = r^l \cos l\varphi; \qquad \psi_l(x, y) = r^l \sin l\varphi.$$
We rotate the xy plane about the origin by the angle $\vartheta$:

$$x' = x\cos\vartheta - y\sin\vartheta, \qquad y' = x\sin\vartheta + y\cos\vartheta. \tag{115}$$

It is easily seen that equation (112) now remains invariant, or more precisely, it looks exactly the same in the new variables:

$$\frac{\partial^2 U}{\partial x'^2} + \frac{\partial^2 U}{\partial y'^2} = 0. \tag{116}$$

This can be verified directly by using (115) and the rule for differentiating a function of a function. Or it follows directly from the fact that the left-hand side of equation (113) has a definite value independently of the axes chosen, so that it has the same form for any Cartesian axes. The polynomials $\varphi_l(x', y')$, $\psi_l(x', y')$ must satisfy equation (116), and consequently also (112), and must therefore be linearly expressible in terms of $\varphi_l(x, y)$ and $\psi_l(x, y)$. This in fact gives us a linear representation of the group of rotations on a plane.
We now take two different polynomials that are linear combinations of the above:

$$\varphi^{(1)}_l(x, y) = \varphi_l(x, y) - i\psi_l(x, y); \qquad \psi^{(1)}_l(x, y) = \varphi_l(x, y) + i\psi_l(x, y)$$

or

$$\varphi^{(1)}_l(x, y) = (x - iy)^l = r^l e^{-il\varphi}; \qquad \psi^{(1)}_l(x, y) = (x + iy)^l = r^l e^{il\varphi}.$$

These polynomials yield the following transformation:

$$\varphi^{(1)}_l(x', y') = r^l e^{-il(\varphi + \vartheta)} = e^{-il\vartheta}\,\varphi^{(1)}_l(x, y),$$
$$\psi^{(1)}_l(x', y') = r^l e^{il(\varphi + \vartheta)} = e^{il\vartheta}\,\psi^{(1)}_l(x, y),$$

i.e. the matrix

$$\begin{Vmatrix} e^{-il\vartheta}, & 0 \\ 0, & e^{il\vartheta} \end{Vmatrix}$$

corresponds in the linear representation to transformation (115), where the angle $\vartheta$ can have any value. The form of the matrix implies at once that the linear representation has a reduced form. It gives two linear representations of the first order, defined by the numbers $e^{-il\vartheta}$ and $e^{il\vartheta}$. The integer l can take any value throughout the above discussion. We have obtained in this way the same linear representations of the rotation group for a plane as were obtained earlier in [69].
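The transformation law of the harmonic polynomials under a plane rotation can be illustrated numerically. The sketch below evaluates $(x + iy)^l$ before and after the rotation (115) for arbitrarily chosen values of l, $\vartheta$, x and y, and also checks the equivalent real form in which the pair (real part, imaginary part) is turned through the angle $l\vartheta$; the names used are illustrative.

```python
import cmath
import math

def rotate(x, y, theta):
    """Rotation (115) of the plane about the origin by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return x * c - y * s, x * s + y * c

def omega(x, y, l):
    """The homogeneous harmonic polynomial (x + iy)^l."""
    return complex(x, y) ** l

l, theta = 3, 0.9
x, y = 0.4, -1.2
xr, yr = rotate(x, y, theta)

lhs = omega(xr, yr, l)                                   # polynomial at the rotated point
rhs = cmath.exp(1j * l * theta) * omega(x, y, l)         # e^{i l theta} times the original

# Real form: the real and imaginary parts mix through a rotation by l*theta.
phi_l, psi_l = omega(x, y, l).real, omega(x, y, l).imag
phi_rot = math.cos(l * theta) * phi_l - math.sin(l * theta) * psi_l
psi_rot = math.sin(l * theta) * phi_l + math.cos(l * theta) * psi_l
```

The complex combinations diagonalize the representation, which is why the 2×2 real matrix splits into the two first order representations $e^{\pm il\vartheta}$.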




We now turn to the Laplace equation in three variables:

$$\frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2} + \frac{\partial^2 U}{\partial z^2} = 0 \tag{117}$$

or

$$\operatorname{div} \operatorname{grad} U = 0. \tag{118}$$

We consider homogeneous polynomials of degree l, now in three variables:

$$\varphi_l(x, y, z) = X_0 z^l + X_1(x, y) z^{l-1} + X_2(x, y) z^{l-2} + \ldots + X_{l-1}(x, y) z + X_l(x, y), \tag{119}$$

where the $X_k(x, y)$ are homogeneous polynomials of degree k in their arguments ($X_0$ being a constant). Each $X_k(x, y)$ contains $(k + 1)$ arbitrary coefficients, so that in general the polynomial $\varphi_l(x, y, z)$ of degree l in three variables will contain the following number of arbitrary coefficients:

$$1 + 2 + 3 + \ldots + (l + 1) = \frac{(l+1)(l+2)}{2}.$$

On substituting (119) in equation (117), we get a homogeneous polynomial of degree $(l - 2)$ on the left, and on equating its coefficients to zero, we get $l(l-1)/2$ homogeneous equations in the $(l+1)(l+2)/2$ unknown coefficients of $\varphi_l(x, y, z)$. We have:

$$\frac{(l+1)(l+2)}{2} - \frac{l(l-1)}{2} = 2l + 1,$$

so that at least $(2l + 1)$ coefficients in $\varphi_l(x, y, z)$ remain arbitrary, i.e. there will exist at least $(2l + 1)$ linearly independent homogeneous polynomials of degree l satisfying equation (117). By using the same method as for two variables, it can be shown that there are not more than $(2l + 1)$ of these polynomials, i.e. there are precisely $(2l + 1)$. We write these polynomials as

$$\psi^{(l)}_s(x, y, z) \qquad (s = 1, 2, \ldots, 2l + 1).$$

If

$$(x', y', z') = U(x, y, z)$$

is a rotation of three-dimensional space about the origin, equation (117) will remain invariant, and the polynomials $\psi^{(l)}_s(x, y, z)$ give a linear representation of order $(2l + 1)$ of the group of rotations of three-dimensional space.
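The counting argument above amounts to subtracting the number of monomials of degree $(l - 2)$ in three variables from the number of monomials of degree l. A minimal sketch, using binomial coefficients for the two counts (the function name is illustrative):

```python
from math import comb

def free_coefficients(l):
    """Unknown coefficients minus linear conditions for a degree-l homogeneous polynomial."""
    total = comb(l + 2, 2)       # (l+1)(l+2)/2 monomials of degree l in x, y, z
    conditions = comb(l, 2)      # l(l-1)/2 monomials of degree l-2 (zero for l < 2)
    return total - conditions

counts = [free_coefficients(l) for l in range(8)]
```

The difference is a lower bound on the number of independent harmonic polynomials; the text then argues separately that it is also an upper bound.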

We give later a detailed treatment of these harmonic polynomials and introduce explicit expressions for them. We shall see that they can always be chosen so as to be orthogonal and normalized in any sphere with centre at the origin. The linear representation of the rotation group that they then afford is unitary. This representation can be shown to be in fact equivalent to the representation $D_l\{\alpha, \beta, \gamma\}$ which we constructed in [69]. We shall return to this problem later.


72. Direct matrix products. Suppose we have two matrices

$$A = \begin{Vmatrix} a_{11}, & a_{12}, & \ldots, & a_{1n} \\ a_{21}, & a_{22}, & \ldots, & a_{2n} \\ \ldots & \ldots & \ldots & \ldots \\ a_{n1}, & a_{n2}, & \ldots, & a_{nn} \end{Vmatrix} \quad \text{and} \quad B = \begin{Vmatrix} b_{11}, & b_{12}, & \ldots, & b_{1m} \\ b_{21}, & b_{22}, & \ldots, & b_{2m} \\ \ldots & \ldots & \ldots & \ldots \\ b_{m1}, & b_{m2}, & \ldots, & b_{mm} \end{Vmatrix}, \tag{120}$$

the first being of order n and the second of order m. We form a new matrix C, whose elements $c_{ij;kl}$ are obtained by multiplying each element of A by each element of B:

$$\{C\}_{ij;kl} = c_{ij;kl} = a_{ik} b_{jl}. \tag{121}$$


Here the set of two integers (i, j) plays the role of first subscript, and the set of integers (k, l) that of second subscript, where

i and k = 1, 2, ..., n;

j and l = 1, 2, ..., m.

In other words, we have a special method of reference to rows and columns, in which they are indicated by a set of two integers, the first taking values from 1 to n and the second from 1 to m. We can naturally enumerate the rows and columns in the ordinary way by simple integers which go from 1 to nm, with one such definite integer corresponding to each pair (i, j) or (k, l), the integers being the same if the pairs are the same. Various different methods can be used for the enumeration by single integers. Passing from one method to another amounts to a simultaneous interchange of rows and columns, i.e. to passage to a similar matrix, a difference that will be of no significance in what follows.

The matrix C is called the direct product of matrices A and B, and is generally written as

$$C = A \times B. \tag{122}$$

The order of the factors is of no significance in this new type of product.

Suppose, for instance, that both matrices (120) are of the second order. Their direct product is now a matrix of the fourth order which we can write say in the form

$$\begin{Vmatrix}
a_{11}b_{11}, & a_{11}b_{12}, & a_{12}b_{11}, & a_{12}b_{12} \\
a_{11}b_{21}, & a_{11}b_{22}, & a_{12}b_{21}, & a_{12}b_{22} \\
a_{21}b_{11}, & a_{21}b_{12}, & a_{22}b_{11}, & a_{22}b_{12} \\
a_{21}b_{21}, & a_{21}b_{22}, & a_{22}b_{21}, & a_{22}b_{22}
\end{Vmatrix}
=
\begin{Vmatrix}
c_{11;11}, & c_{11;12}, & c_{11;21}, & c_{11;22} \\
c_{12;11}, & c_{12;12}, & c_{12;21}, & c_{12;22} \\
c_{21;11}, & c_{21;12}, & c_{21;21}, & c_{21;22} \\
c_{22;11}, & c_{22;12}, & c_{22;21}, & c_{22;22}
\end{Vmatrix}$$

or in an alternative form, given a simultaneous interchange of rows and columns.
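Definition (121) translates directly into code once a rule for enumerating the index pairs is fixed. The sketch below uses the lexicographic enumeration, pair (i, j) going to the single index i·m + j (counting from zero); this choice is an assumption made for illustration, and any other enumeration gives a similar matrix, as remarked in the text.

```python
def direct_product(a, b):
    """Direct product C = A x B with c[(i,j),(k,l)] = a[i][k] * b[j][l].

    Rows are enumerated by pairs (i, j) -> i*m + j and columns by (k, l) -> k*m + l,
    where m is the order of B.
    """
    n, m = len(a), len(b)
    return [[a[i][k] * b[j][l] for k in range(n) for l in range(m)]
            for i in range(n) for j in range(m)]

A = [[1, 2],
     [3, 4]]
B = [[0, 5],
     [6, 7]]
C = direct_product(A, B)
for row in C:
    print(row)
```

With this enumeration the fourth order matrix consists of 2×2 blocks, each block being B multiplied by one element of A, which is the arrangement shown in the display above.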
Let A and B be diagonal matrices:

$$A = [\gamma_1, \ldots, \gamma_n]; \qquad B = [\delta_1, \ldots, \delta_m].$$

In this case $a_{ik} = 0$ and $b_{jl} = 0$ for $i \neq k$ and $j \neq l$, and consequently, by (121), $c_{ij;kl}$ only differs from zero when the pair (i, j) is the same as (k, l), i.e. the matrix C will also be diagonal. The principal diagonal contains all the possible products of the $\gamma_i$ with the $\delta_j$. If all the $\gamma_i$ and $\delta_j$ are unity, C is also a unit matrix. We thus have the following theorem.

Theorem I. The direct product of two diagonal matrices is a diagonal matrix, and the direct product of two unit matrices is a unit matrix.

We also prove the following theorem. 

Theorem II. If $A^{(1)}$ and $A^{(2)}$ are two matrices of the same order n and $B^{(1)}$, $B^{(2)}$ are matrices of the same order m, the following formula is valid:

$$(A^{(2)} \times B^{(2)})(A^{(1)} \times B^{(1)}) = A^{(2)}A^{(1)} \times B^{(2)}B^{(1)}. \tag{123}$$

It should be noticed that when we write two matrices of the same order after each other with no sign, this always means the ordinary product of the matrices. Denoting the elements by the corresponding small letters with two subscripts, we have by definition of direct product:

$$\{A^{(s)} \times B^{(s)}\}_{ij;kl} = a^{(s)}_{ik} b^{(s)}_{jl} \qquad (s = 1, 2),$$

and on using the ordinary matrix multiplication rule, we get the following expression for an element of the left-hand side of (123):

$$d_{ij;kl} = \sum_{p=1}^{n} \sum_{q=1}^{m} a^{(2)}_{ip} b^{(2)}_{jq} a^{(1)}_{pk} b^{(1)}_{ql}. \tag{124}$$

We show that the same expression is obtained for the corresponding element of the right-hand side. We have by the definition of ordinary product:

$$\{A^{(2)}A^{(1)}\}_{ik} = \sum_{p=1}^{n} a^{(2)}_{ip} a^{(1)}_{pk}; \qquad \{B^{(2)}B^{(1)}\}_{jl} = \sum_{q=1}^{m} b^{(2)}_{jq} b^{(1)}_{ql},$$

and by definition of direct product:

$$d_{ij;kl} = \sum_{p=1}^{n} a^{(2)}_{ip} a^{(1)}_{pk} \cdot \sum_{q=1}^{m} b^{(2)}_{jq} b^{(1)}_{ql},$$

which is the same as (124). We now turn to the proof of a final theorem regarding direct products.
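Formula (123) can be confirmed on a small example with integer matrices, where the comparison is exact. The sketch below assumes the lexicographic enumeration of index pairs, (i, j) going to i·m + j; the matrices are arbitrary and the helper names are illustrative.

```python
def direct_product(a, b):
    """c[(i,j),(k,l)] = a[i][k] * b[j][l], pairs enumerated as i*m + j."""
    n, m = len(a), len(b)
    return [[a[i][k] * b[j][l] for k in range(n) for l in range(m)]
            for i in range(n) for j in range(m)]

def matmul(p, q):
    """Ordinary matrix product."""
    size = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

A2 = [[1, 2], [3, 4]]
A1 = [[0, 1], [1, 1]]
B2 = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
B1 = [[1, 0, 2], [0, 1, 0], [1, 1, 1]]

# Left-hand side of (123): ordinary product of two direct products (order 6).
lhs = matmul(direct_product(A2, B2), direct_product(A1, B1))
# Right-hand side of (123): direct product of the two ordinary products.
rhs = direct_product(matmul(A2, A1), matmul(B2, B1))
```

The two sides agree element by element because the double sum (124) factorizes into a sum over p and a sum over q.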
Theorem III. If A and B are unitary matrices, their direct product $C = A \times B$ is also unitary.

We have by hypothesis:

$$\sum_{s=1}^{n} a_{sp}\,\bar a_{sq} = \delta_{pq}; \qquad \sum_{s=1}^{m} b_{sp}\,\bar b_{sq} = \delta_{pq}. \tag{125}$$

We verify that the columns of C are orthogonal and normalized, and write

$$d_{p_1 q_1;\,p_2 q_2} = \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij;\,p_1 q_1}\,\bar c_{ij;\,p_2 q_2},$$

i.e. by (121):

$$d_{p_1 q_1;\,p_2 q_2} = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ip_1}\,\bar a_{ip_2}\,b_{jq_1}\,\bar b_{jq_2} = \sum_{i=1}^{n} a_{ip_1}\,\bar a_{ip_2} \cdot \sum_{j=1}^{m} b_{jq_1}\,\bar b_{jq_2}. \tag{126}$$

If the pairs of numbers $(p_1, q_1)$ and $(p_2, q_2)$ are different, at least one of the factors on the right-hand side of (126) is zero, whilst if the pairs are the same, both factors are unity by (125). Hence $d_{p_1 q_1;\,p_2 q_2}$ is zero if the pairs are different, and unity if the pairs are the same, which proves the theorem.
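As a numerical illustration of Theorem III, the sketch below forms the direct product of two particular unitary matrices of the second order and evaluates the sums $d_{p_1q_1;\,p_2q_2}$ over the columns of C. The two matrices are arbitrary examples chosen for this check, and the pair enumeration (i, j) going to i·m + j is again an assumption.

```python
import cmath
import math

def direct_product(a, b):
    """c[(i,j),(k,l)] = a[i][k] * b[j][l], pairs enumerated as i*m + j."""
    n, m = len(a), len(b)
    return [[a[i][k] * b[j][l] for k in range(n) for l in range(m)]
            for i in range(n) for j in range(m)]

def column_gram(c):
    """Entry [p][q] is the sum over rows r of c[r][p] * conj(c[r][q])."""
    size = len(c)
    return [[sum(c[r][p] * c[r][q].conjugate() for r in range(size))
             for q in range(size)] for p in range(size)]

# A: unitary matrix of the form {a, b; -conj(b), conj(a)} with |a|^2 + |b|^2 = 1.
a = cmath.exp(0.5j) * math.cos(0.3)
b = cmath.exp(-0.2j) * math.sin(0.3)
A = [[a, b], [-b.conjugate(), a.conjugate()]]

# B: another unitary matrix, (1/sqrt 2) {1, i; i, 1}.
r = 1 / math.sqrt(2)
B = [[r, 1j * r], [1j * r, r]]

gram = column_gram(direct_product(A, B))
```

Up to rounding error `gram` is the unit matrix of order four, i.e. the columns of C are orthogonal and normalized.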

We can clearly form the direct product of three matrices by forming the direct product of the direct product of the first two with the third:

$$A^{(1)} \times A^{(2)} \times A^{(3)}.$$

On retaining the previous notation, we have the following for an element of the new matrix:

$$c_{i_1 i_2 i_3;\,k_1 k_2 k_3} = a^{(1)}_{i_1 k_1}\,a^{(2)}_{i_2 k_2}\,a^{(3)}_{i_3 k_3}.$$

The direct product of any finite number of matrices may be formed in a similar manner, the order of the matrix representing the direct product being the product of the orders of the matrix factors. The order of the factors is of no significance.




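73. The composition of two linear representations of a group. Suppose we have two linear representations of a given group G with elements $G_\alpha$:

$$x'_i = a^{(\alpha)}_{i1} x_1 + \ldots + a^{(\alpha)}_{in} x_n \qquad (i = 1, 2, \ldots, n) \tag{127}$$

and

$$y'_k = b^{(\alpha)}_{k1} y_1 + \ldots + b^{(\alpha)}_{km} y_m \qquad (k = 1, 2, \ldots, m), \tag{128}$$

where the superscript $\alpha$ runs over a finite or infinite set of values. We write $A^{(\alpha)}$ and $B^{(\alpha)}$ for the matrices of transformations (127) and (128) and form their direct product:

$$C^{(\alpha)} = A^{(\alpha)} \times B^{(\alpha)}. \tag{129}$$

The matrices $C^{(\alpha)}$ also give a linear representation of the group G. For to any element $G_\alpha$ of G there corresponds a matrix $C^{(\alpha)}$; and for the product $G_{\alpha_2} G_{\alpha_1} = G_{\alpha_3}$ we consider the matrix $C^{(\alpha_2)} C^{(\alpha_1)}$, which is given, by (123), by

$$C^{(\alpha_2)} C^{(\alpha_1)} = (A^{(\alpha_2)} \times B^{(\alpha_2)})(A^{(\alpha_1)} \times B^{(\alpha_1)}) = (A^{(\alpha_2)} A^{(\alpha_1)}) \times (B^{(\alpha_2)} B^{(\alpha_1)}).$$

But since matrices $A^{(\alpha)}$ and $B^{(\alpha)}$ give linear representations of the group, we have

$$A^{(\alpha_2)} A^{(\alpha_1)} = A^{(\alpha_3)} \quad \text{and} \quad B^{(\alpha_2)} B^{(\alpha_1)} = B^{(\alpha_3)},$$

and consequently:

$$C^{(\alpha_2)} C^{(\alpha_1)} = A^{(\alpha_3)} \times B^{(\alpha_3)},$$

i.e. by (129):

$$C^{(\alpha_2)} C^{(\alpha_1)} = C^{(\alpha_3)}.$$

Thus to a product of elements $G_\alpha$ corresponds the product of corresponding matrices $C^{(\alpha)}$, and these matrices give a new linear representation of G. We notice that we now have the direct product of unit matrices $A^{(\alpha)}$ and $B^{(\alpha)}$, i.e. a unit matrix $C^{(\alpha)}$, corresponding to the identity element of group G.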

We form the nm products $x_i y_k$ and subject each of the factors to transformations (127) and (128). We have

$$x'_i y'_k = (a^{(\alpha)}_{i1} x_1 + \ldots + a^{(\alpha)}_{in} x_n)(b^{(\alpha)}_{k1} y_1 + \ldots + b^{(\alpha)}_{km} y_m),$$

or on removing the brackets:

$$x'_i y'_k = \sum_{p=1}^{n} \sum_{q=1}^{m} c^{(\alpha)}_{ik;pq}\, x_p y_q, \quad \text{where} \quad c^{(\alpha)}_{ik;pq} = a^{(\alpha)}_{ip} b^{(\alpha)}_{kq},$$

i.e. if $x_i$ and $y_k$ are objects in linear representations defined by matrices $A^{(\alpha)}$ and $B^{(\alpha)}$, then the $x_i y_k$ are objects in the linear representation defined by matrices $C^{(\alpha)}$. If the $A^{(\alpha)}$ and $B^{(\alpha)}$ give irreducible linear representations, the representation given by the $C^{(\alpha)}$ is not necessarily irreducible. We treat later the case when G is the group of rotations of three-dimensional space, and the $A^{(\alpha)}$ and $B^{(\alpha)}$ are the different irreducible linear representations of the group that we constructed in [69]. We show that in this case the product

$$D_j\{\alpha, \beta, \gamma\} \times D_{j'}\{\alpha, \beta, \gamma\}$$

is reducible, and we find the irreducible representations that compose it.
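The statement that the products $x_i y_k$ transform by the direct product matrix can be checked on a small integer example. In the sketch below the matrices A and B and the vectors x and y are arbitrary, and the products are enumerated by the rule (i, k) going to i·m + k, which is an assumption made for illustration.

```python
def direct_product(a, b):
    """c[(i,k),(p,q)] = a[i][p] * b[k][q], pairs enumerated as i*m + k."""
    n, m = len(a), len(b)
    return [[a[i][p] * b[k][q] for p in range(n) for q in range(m)]
            for i in range(n) for k in range(m)]

def apply(mat, vec):
    """Linear transformation vec -> mat * vec."""
    return [sum(row[t] * vec[t] for t in range(len(vec))) for row in mat]

def products(x, y):
    """The nm products x_i * y_k in the order (i, k) -> i*m + k."""
    return [xi * yk for xi in x for yk in y]

A = [[1, 2], [3, 4]]                      # matrix of transformation (127), n = 2
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]     # matrix of transformation (128), m = 3
x = [5, -1]
y = [2, 7, -3]

# Transform the factors first and then multiply ...
transformed_then_multiplied = products(apply(A, x), apply(B, y))
# ... or multiply first and transform with the direct product matrix.
multiplied_then_transformed = apply(direct_product(A, B), products(x, y))
```

Both routes give the same list of six numbers, which is exactly the content of the relation $c^{(\alpha)}_{ik;pq} = a^{(\alpha)}_{ip} b^{(\alpha)}_{kq}$.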


We take as an example the Schrédinger equation for two electrons in the 
field of a positive nucleus. The equation is of the form 


2 2 2 
|--rw  (Ger + ae tte)? | (130) 
where 
2 
Ve ee _ 
s=1 V2? + yy —22 2 ¥(@ — wz)? + (Y1 — Ye)? + (21 — 22) 


the constants having their usual significance. The second term in the expression for V is due to the interaction of the electrons. If we neglect this interaction as a first approximation, the equation becomes

$$(H_1 + H_2)\,\psi = E\psi, \tag{132}$$
where 
he? e e 3? ee 
le Se ees eee ee eee oe = 2 
s Bx? m (eS TT ays as a aa) yetyta (8 1, 2) 
Suppose that the separate equations:

$$H_1 \psi = E\psi; \qquad H_2 \psi = E\psi \tag{133}$$

have eigenvalues $E_1$ and $E_2$ and corresponding eigenfunctions $\psi_1(x_1, y_1, z_1)$ and $\psi_2(x_2, y_2, z_2)$, i.e.

$$H_1 \psi_1 = E_1 \psi_1; \qquad H_2 \psi_2 = E_2 \psi_2. \tag{134}$$

If we substitute in (132):

$$\psi = \psi_1(x_1, y_1, z_1) \cdot \psi_2(x_2, y_2, z_2),$$

we clearly get by (134):

$$(H_1 + H_2)\,\psi = \psi_2 H_1 \psi_1 + \psi_1 H_2 \psi_2 = (E_1 + E_2)\,\psi_1 \psi_2 = (E_1 + E_2)\,\psi,$$

i.e. equation (132) will have the eigenfunction $\psi_1 \psi_2$, to which the eigenvalue $(E_1 + E_2)$ corresponds. The left-hand sides of equations (133) contain Laplace's operator and the distance of a point from the origin, and they are consequently unchanged on carrying out a rotation of space about the origin. It may happen that more than one eigenfunction $\psi_1$ corresponds to the characteristic root $E = E_1$ in the first of equations (133). In this case, all the eigenfunctions in question, representing solutions of the first of equations (133), yield a linear representation of the rotation group, just as did the homogeneous harmonic polynomials of [69]. Let the representation be $D_{j_1}\{\alpha, \beta, \gamma\}$. In precisely the same way, the solutions of the second of equations (133) for a given eigenvalue $E = E_2$ give us a representation $D_{j_2}\{\alpha, \beta, \gamma\}$ of the rotation group. According to the above, the product $\psi_1 \psi_2$ gives us a representation of the rotation group equal to the direct product $D_{j_1} \times D_{j_2}$, and to recognize the physical characteristics of the corresponding eigenvalue $(E_1 + E_2)$ of equation (132) it becomes essential for us to distinguish the component irreducible representations. This circumstance has a fundamental role in perturbation theory.


74. The direct product of groups and its linear representations. The concept of the direct product of matrices plays a part in another problem to which we shall now turn our attention. Let G and H be two groups, with elements $G_\alpha$ and $H_\beta$, where the $\alpha$ and $\beta$ run independently over in general different sets of values. We define a new group F with elements defined by pairs of elements of G and H:

$$F_{\alpha\beta} = (G_\alpha, H_\beta),$$

the first element being from G and the second from H. The identity (unit) element of the new group is defined as the $F_{\alpha_0\beta_0}$ for which $G_{\alpha_0}$ and $H_{\beta_0}$ are the identity elements of G and H, and inverse elements of F are defined in a similar manner. We naturally define the multiplication rule for F by

$$F_{\alpha_2\beta_2}\,F_{\alpha_1\beta_1} = (G_{\alpha_2} G_{\alpha_1},\; H_{\beta_2} H_{\beta_1}).$$

As is easily seen, the set of $F_{\alpha\beta}$ in fact forms a group. We call the group F the direct product of groups G and H. Suppose we have a linear representation of G formed by matrices $A^{(\alpha)}$ and a linear representation of H formed by matrices $B^{(\beta)}$. It can be shown by using (123), as in the previous section, that the direct products

$$C^{(\alpha,\beta)} = A^{(\alpha)} \times B^{(\beta)}$$

give a linear representation of group F. Moreover, if $A^{(\alpha)}$ and $B^{(\beta)}$ are unitary representations, the representation $C^{(\alpha,\beta)}$ of F is also unitary [72].

We now show that, if the representations $A^{(\alpha)}$ and $B^{(\beta)}$ are irreducible, the representation $C^{(\alpha,\beta)}$ of group F is also irreducible. Let the $A^{(\alpha)}$ be of order n, and the $B^{(\beta)}$ of order m. The matrices $C^{(\alpha,\beta)}$ will be of order nm. Let a matrix X exist of order nm which commutes with all the $C^{(\alpha,\beta)}$. Matrix elements will be denoted by the corresponding small letters. We thus have, for any subscripts i, j, p, q, and for any $\alpha$ and $\beta$:

$$\sum_{k=1}^{n} \sum_{l=1}^{m} x_{ij;kl}\,a^{(\alpha)}_{kp}\,b^{(\beta)}_{lq} = \sum_{k=1}^{n} \sum_{l=1}^{m} a^{(\alpha)}_{ik}\,b^{(\beta)}_{jl}\,x_{kl;pq}, \tag{135}$$

where

$$a^{(\alpha)}_{kp}\,b^{(\beta)}_{lq} = c^{(\alpha,\beta)}_{kl;pq} \quad \text{and} \quad a^{(\alpha)}_{ik}\,b^{(\beta)}_{jl} = c^{(\alpha,\beta)}_{ij;kl}.$$

If we take $G_\alpha$ as the identity element of G, $A^{(\alpha)}$ will be a unit matrix, i.e. $a^{(\alpha)}_{kp} = 0$ for $k \neq p$ and $a^{(\alpha)}_{pp} = 1$, and (135) gives us:

$$\sum_{l=1}^{m} x_{ij;pl}\,b^{(\beta)}_{lq} = \sum_{l=1}^{m} b^{(\beta)}_{jl}\,x_{il;pq}, \tag{136}$$

and similarly, taking $H_\beta$ as the identity element of H, we get

$$\sum_{k=1}^{n} x_{ij;kq}\,a^{(\alpha)}_{kp} = \sum_{k=1}^{n} a^{(\alpha)}_{ik}\,x_{kj;pq}. \tag{137}$$

If we take the elements $x_{ij;kl}$ of X and fix the subscripts i and k, we get the $m^2$ elements

$$x_{ij;kl} \qquad (j, l = 1, 2, \ldots, m)$$

which give a matrix of order m. We write $X^{(i,k)}$ for this matrix. Similarly, on fixing j and l in $x_{ij;kl}$, we get a matrix $X_{(j,l)}$ of order n. By (136), all the $X^{(i,k)}$ will commute with all the matrices $B^{(\beta)}$, forming an irreducible representation of group H, and consequently all the $X^{(i,k)}$ are scalar matrices, i.e. the elements $x_{ij;kl}$ for fixed i and k have the same value if $j = l$, and are zero if $j \neq l$. We can write this as follows:

$$x_{ij;kl} = x'_{ik}\,\delta_{jl}. \tag{138$_1$}$$

Similarly, we have by considering the matrices $X_{(j,l)}$ and using (137):

$$x_{ij;kl} = x''_{jl}\,\delta_{ik}, \tag{138$_2$}$$

where, as usual,

$$\delta_{pq} = 0 \ \text{for} \ p \neq q \quad \text{and} \quad \delta_{pp} = 1.$$

It follows from equations (138$_1$) and (138$_2$) that $x_{ij;kl}$ differs from zero only when $i = k$ and $j = l$, in which case all the $x_{ij;ij}$ are numerically the same, i.e. the matrix X commuting with all the $C^{(\alpha,\beta)}$ is necessarily a scalar matrix. It now follows immediately that the linear representation of group F defined by the direct product $A^{(\alpha)} \times B^{(\beta)}$ is irreducible. It can be shown that all the irreducible representations of group F are obtained in this way.




Let G and H be groups of linear transformations with the same number of variables, and let any pair of matrices $G_\alpha$ and $H_\beta$ commute:

$$G_\alpha H_\beta = H_\beta G_\alpha. \tag{139}$$

We have assumed in the above arguments that an element of group F is defined by a pair of elements $(G_\alpha, H_\beta)$, and we have laid down a definite rule, written above, for multiplication within group F. In the present case we can take the elements of F as simply the matrix products (139), which are independent of the order of the factors. This new F is isomorphic with the previous group F. If $G_{\alpha_0}$ and $H_{\beta_0}$ are unit matrices, the product $G_{\alpha_0} H_{\beta_0} = H_{\beta_0} G_{\alpha_0}$ is also a unit matrix. The matrix $G_\alpha^{-1} H_\beta^{-1} = H_\beta^{-1} G_\alpha^{-1}$ is clearly the inverse of $G_\alpha H_\beta$, and we have the following multiplication rule by (139):

$$G_{\alpha_2} H_{\beta_2} \cdot G_{\alpha_1} H_{\beta_1} = (G_{\alpha_2} G_{\alpha_1})(H_{\beta_2} H_{\beta_1}),$$

i.e. all the previous properties are satisfied here in the formation of F, so that product (139) can be taken as the variable element of group F. We take the particular case when G is the group of rotations of three-dimensional space and H is the second order group consisting of the identity transformation I and symmetry S with respect to the origin [57]. Condition (139) is satisfied here: if $G_\alpha$ is any rotation of space, clearly $G_\alpha S = S G_\alpha$. In this case F is the group of all real orthogonal transformations of three-dimensional space. We had two first degree linear representations in [67] for H. One was the identity representation, in which the number (+1) corresponds to both I and S, and the other was the anti-symmetric representation, in which (+1) corresponds with the matrix I and (−1) with the matrix S. If we now take a linear representation $D_j\{\alpha, \beta, \gamma\}$ of the rotation group, we can take the direct product of a matrix of this representation with both these representations of the group of symmetry with respect to the origin. We obtain in the one case a linear representation of the total group of orthogonal transformations in which the same matrix $D_j\{\alpha, \beta, \gamma\}$ corresponds to every rotation with Eulerian angles $\{\alpha, \beta, \gamma\}$, whether taken in the pure form or in association with symmetry with respect to the origin. We write $D_j^{+}\{\alpha, \beta, \gamma\}$ for this representation of the group of orthogonal transformations. In the second case, the matrix $D_j\{\alpha, \beta, \gamma\}$ corresponds to a pure rotation and $-D_j\{\alpha, \beta, \gamma\}$ to a rotation in association with symmetrical reflection. We write $D_j^{-}\{\alpha, \beta, \gamma\}$ for this latter representation of the orthogonal transformation group.




We shall discuss one further example of the direct product of two
groups. Let $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ be two points and $G$ the group
of rotations of three-dimensional space. Our variables now undergo
the linear transformations:

$$\begin{aligned}
x_k' &= g_{11} x_k + g_{12} y_k + g_{13} z_k, \\
y_k' &= g_{21} x_k + g_{22} y_k + g_{23} z_k, \\
z_k' &= g_{31} x_k + g_{32} y_k + g_{33} z_k,
\end{aligned} \qquad (k = 1, 2) \tag{140}$$

where the array of $g_{ik}$ is the matrix of a certain rotation. We suppose
further that $H$ is the group consisting of the identity transformation

$$x_k' = x_k, \quad y_k' = y_k, \quad z_k' = z_k \qquad (k = 1, 2)$$

and the transformation corresponding to interchange of the subscripts
1 and 2 in our points. This latter transformation $S$ will have the form

$$x_1' = x_2, \quad y_1' = y_2, \quad z_1' = z_2; \qquad x_2' = x_1, \quad y_2' = y_1, \quad z_2' = z_1.$$

We obviously have $S^2 = I$, and the group $H$ will therefore consist
of the two transformations $I$ and $S$. Given a rotation $G_\alpha$, clearly
$G_\alpha S = S G_\alpha$, since it is a matter of indifference whether the
renumbering of the points comes before or after the rotation. We obtain
here the same linear representations for the total group $F$ as above.
If we took n instead of two points, the group H, consisting of inter- 
changes of the point subscripts, would have for its elements linear 
transformations in n variables, and H would be isomorphic with the 
group of permutations of n elements. In this last case, the operations 
of rotation and of point subscript permutation similarly commute 
with each other, and the direct products of matrices of the linear re- 
presentation of the rotation group with matrices of the linear repre- 
sentation of the permutation group give us a linear representation of 
the total group F. 


75. Decomposition of the composition $D_j \times D_{j'}$ of linear representations
of the rotation group. We now return to our discussion of [73]
of the Schrödinger equation for two electrons, where we saw that,
neglecting electron interaction, the eigenfunctions of the Schrödinger
equation give us a linear representation of the rotation group which is
obtained by the composition of two linear representations of this group. 
The results of the previous section show that it is important for us to be 
able to decompose such a linear representation into its irreducible parts. 
This is the problem that concerns us in the present article, and it may 
be stated mathematically as follows. Suppose we have two irreducible
representations $D_j\{\alpha, \beta, \gamma\}$ and $D_{j'}\{\alpha, \beta, \gamma\}$ of the rotation group.
Their composition $D_j \times D_{j'}$ also gives us [73] a linear representation
of the rotation group. We require to find the irreducible parts of which
this representation is composed.

The objects of the linear representation $D_j$ of order $(2j + 1)$ are

$$U_m = \frac{u_1^{j+m}\, u_2^{j-m}}{\sqrt{(j+m)!\,(j-m)!}} \qquad (m = -j, -j+1, \ldots, j-1, j), \tag{142}$$

and those of $D_{j'}$ are

$$V_{m'} = \frac{v_1^{j'+m'}\, v_2^{j'-m'}}{\sqrt{(j'+m')!\,(j'-m')!}} \qquad (m' = -j', -j'+1, \ldots, j'-1, j'), \tag{143}$$

where $(u_1, u_2)$ and $(v_1, v_2)$ undergo the same unitary transformations
with (+1) determinant [68]. If we form the $(2j + 1)(2j' + 1)$ quantities

$$W_{mm'} = U_m V_{m'} = \frac{u_1^{j+m}\, u_2^{j-m}\, v_1^{j'+m'}\, v_2^{j'-m'}}{\sqrt{(j+m)!\,(j-m)!\,(j'+m')!\,(j'-m')!}} \tag{144}$$

$$(m = -j, -j+1, \ldots, j-1, j; \quad m' = -j', -j'+1, \ldots, j'-1, j'),$$

these will be the objects in the linear representation of the rotation
group defined by the composition $D_j \times D_{j'}$.

We shall assume in future that $j$ and $j'$ are either integers or half
integers, i.e. to be more precise, we shall take linear representations
of the unitary group in two variables with unit determinants.

Let $k$ be an integer (or half an integer) satisfying the inequality

$$|j - j'| \le k \le j + j'. \tag{145}$$

We show that we can form $2k + 1$ linear combinations of magnitudes
(144) such that they give a linear representation $D_k$ of the rotation
group.

For the proof, we form expressions of the type

$$L = (u_1 v_2 - u_2 v_1)^{l}\, (u_1 x_1 + u_2 x_2)^{2j-l}\, (v_1 x_1 + v_2 x_2)^{2j'-l}, \tag{146}$$

where $l$ is a fixed integer satisfying the inequalities:

$$l \ge 0; \qquad l \le 2j; \qquad l \le 2j'. \tag{147}$$


If the variables $(u_1, u_2)$ and $(v_1, v_2)$ undergo the same linear
transformation

$$\begin{aligned}
u_1' &= a_{11} u_1 + a_{12} u_2, & v_1' &= a_{11} v_1 + a_{12} v_2, \\
u_2' &= a_{21} u_1 + a_{22} u_2, & v_2' &= a_{21} v_1 + a_{22} v_2,
\end{aligned}$$

with (+1) determinant, i.e. $a_{11} a_{22} - a_{12} a_{21} = 1$, it may easily be seen
that the first factor in (146) remains unchanged. For

$$u_1' v_2' - u_2' v_1' = (a_{11} a_{22} - a_{12} a_{21})\,(u_1 v_2 - u_2 v_1).$$
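The invariance of the first factor can be checked numerically. The following is only a minimal sketch: the function names and the particular parametrisation of a unitary matrix with (+1) determinant are our own assumptions, chosen for illustration.

```python
import cmath
import math


def unimodular_unitary(theta, phi, psi):
    """Return (a11, a12, a21, a22) for a 2x2 unitary matrix with determinant +1.

    Assumed parametrisation: a11 = a, a12 = b, a21 = -conj(b), a22 = conj(a),
    with |a|^2 + |b|^2 = 1.
    """
    a = cmath.exp(1j * phi) * math.cos(theta)
    b = cmath.exp(1j * psi) * math.sin(theta)
    return a, b, -b.conjugate(), a.conjugate()


def transform(pair, matrix):
    """Apply the same linear transformation to a pair (x1, x2)."""
    a11, a12, a21, a22 = matrix
    x1, x2 = pair
    return a11 * x1 + a12 * x2, a21 * x1 + a22 * x2


matrix = unimodular_unitary(0.7, 0.3, -1.1)
u = (1.5 + 0.5j, -0.25 + 2j)
v = (0.1 - 1j, 3 + 0.75j)
u_new, v_new = transform(u, matrix), transform(v, matrix)

# The expression u1*v2 - u2*v1 is multiplied by the determinant, which is 1.
before = u[0] * v[1] - u[1] * v[0]
after = u_new[0] * v_new[1] - u_new[1] * v_new[0]
```

The two values `before` and `after` agree up to rounding error, whatever the angles chosen.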
It is clear that (146) is a homogeneous polynomial in $x_1$ and $x_2$ of
degree $2(j + j' - l)$. It therefore consists of terms of the form

$$a_s\, x_1^{s}\, x_2^{2(j+j'-l)-s} \qquad (s = 0, 1, \ldots, 2(j + j' - l)).$$

On introducing the notation:

$$k = j + j' - l, \tag{148}$$

$$y_{m''} = \frac{x_1^{k+m''}\, x_2^{k-m''}}{\sqrt{(k+m'')!\,(k-m'')!}} \qquad (m'' = -k, -k+1, \ldots, k-1, k), \tag{149}$$

we can write (146) as follows:

$$L = \sum_{m''=-k}^{k} c_{m''}\, y_{m''}. \tag{150}$$


The coefficients $c_{m''}$ are dependent on the variables $(u_1, u_2)$ and
$(v_1, v_2)$.

It follows at once from (146) that $c_{m''}$ is a homogeneous polynomial
in $(u_1, u_2)$ of degree $2j$ and a homogeneous polynomial in $(v_1, v_2)$ of
degree $2j'$, i.e. $c_{m''}$ will consist of terms of the form

$$a_{pq}\, u_1^{p}\, u_2^{2j-p}\, v_1^{q}\, v_2^{2j'-q},$$

or we can say, on taking into account (142) and (143), that $c_{m''}$ is a
linear combination of products:

$$c_{m''} = \sum_{m} \sum_{m'} a^{(m'')}_{mm'}\, U_m V_{m'} \qquad (m'' = -k, -k+1, \ldots, k-1, k), \tag{151}$$

where the coefficients $a^{(m'')}_{mm'}$ no longer contain the $u_s$ and $v_s$. We observe
that, in (146), the variables $u_1$ and $v_1$ only appear either in association
with the factor $x_1$, or in the first factor of (146), in which the sum of
the indices of $u_1$ and $v_1$ is equal to $l$. On observing that $y_{m''}$ contains
$x_1^{k+m''}$, we can say that, in the terms of (151), the sum of the indices
of $u_1$ and $v_1$ is $k + m'' + l$, or by (148), the sum is $j + j' + m''$. But
$U_m$ contains $u_1^{j+m}$ and $V_{m'}$ contains $v_1^{j'+m'}$, and hence it follows
immediately that each of expressions (151) contains only the products
$U_m V_{m'}$ for which $m + m' = m''$. We now show that linear combinations
(151) of the $U_m V_{m'}$ in fact give a linear representation of the
rotation group equivalent to $D_k$.




We first recall the definition of contragredient transformation.
Given the two linear transformations

$$(x_1', \ldots, x_n') = A(x_1, \ldots, x_n) \quad \text{and} \quad (y_1', \ldots, y_n') = B(y_1, \ldots, y_n),$$

the necessary and sufficient condition for the equation

$$x_1' y_1' + \ldots + x_n' y_n' = x_1 y_1 + \ldots + x_n y_n$$

to be valid is for $B$ to be contragredient to $A$, i.e. $B = A^{(*)-1}$ (cf. [21]
and [40]).

Let the variables $(u_1, u_2)$ and $(v_1, v_2)$ undergo a simultaneous unitary
transformation $A$ with (+1) determinant. Suppose that the variables
$x_1$ and $x_2$ have now undergone a transformation $A^{(*)-1}$ contragredient
to $A$. It follows from the definition of contragredience that the sums

$$u_1 x_1 + u_2 x_2 \quad \text{and} \quad v_1 x_1 + v_2 x_2$$

now remain invariant. As was proved above, the first factor in (146)
also remains unchanged with the above transformation. The total
sum $L$ therefore remains unchanged, in other words, by (150), the
variables $c_{m''}$ undergo a transformation $B$, contragredient to the
transformation $C$ suffered by the $y_{m''}$.
We bring in the new variables:

$$z_{m''} = \frac{u_1^{k+m''}\, u_2^{k-m''}}{\sqrt{(k+m'')!\,(k-m'')!}}.$$

We can write on applying the binomial formula:

$$(u_1 x_1 + u_2 x_2)^{2k} = (2k)! \sum_{m''=-k}^{+k} z_{m''}\, y_{m''}.$$

The left-hand side remains unchanged by the transformations, and
the same can therefore be said of the right-hand side, i.e. the variables
$z_{m''}$ undergo the same transformation $B$, contragredient to $C$, as the
variables $c_{m''}$. But we know that variables $z_{m''}$ in fact give us a linear
representation $D_k$ of the rotation group, if $(u_1, u_2)$ are the objects of
the unitary group with (+1) determinants. Our assertion is therefore
proved.

We can thus form $(2k + 1)$ linear combinations of variables (144),
which we shall interpret as vector components in space with
$(2j + 1)(2j' + 1)$ dimensions, and the combinations give a linear
representation $D_k$ of the rotation group. On taking into account equation
(148) and inequality (147), the following values are seen to be
assignable to the number $k$:

$$k = j + j',\; j + j' - 1,\; \ldots,\; |j - j'|. \tag{152}$$

We now find how many linear combinations of variables (144) can be
formed. We assume for definiteness that $j \ge j'$. The total number of
linear combinations will be

$$(2j + 2j' + 1) + (2j + 2j' - 1) + \ldots + (2j - 2j' + 1).$$

This is the sum of an arithmetic progression, the number of terms being

$$\frac{(2j + 2j' + 1) - (2j - 2j' + 1)}{2} + 1 = 2j' + 1,$$

and the total number of combinations is $(2j + 1)(2j' + 1)$, i.e. it is
equal to the number of variables (144). The same result would be
obtained on assuming $j < j'$. On writing for brevity:

$$(2j + 1)(2j' + 1) = r,$$

the linear combinations can be denoted by

$$w_1, w_2, \ldots, w_r \tag{153}$$


on the assumption that the combinations run in the same order as the
linear representations $D_k$, where $k$ has the values given in (152). As a
result of a unitary transformation with (+1) determinant on variables
$(u_1, u_2)$ and $(v_1, v_2)$, we get new values $U_m' V_{m'}'$ of variables (144) and
new values $w_i'$ $(i = 1, 2, \ldots, r)$ of variables (153), where the $w_i'$ are
given in terms of the $w_i$ by the quasi-diagonal matrix

$$[D_{j+j'},\; D_{j+j'-1},\; \ldots,\; D_{|j-j'|}]; \tag{154}$$

and each $D_k$ corresponds to the unitary transformation to which the
$(u_1, u_2)$ and $(v_1, v_2)$ have been subjected. We show further that the
linear forms (153) of magnitudes (144) are linearly independent. Let $T$
be the matrix of the linear transformation with the aid of which the $w_i$
are expressed in terms of the variables (144). The direct product
$D_j \times D_{j'}$ is the matrix of the linear transformation for variables (144),
and we have by the above:

$$[D_{j+j'},\; D_{j+j'-1},\; \ldots,\; D_{|j-j'|}] = T\,(D_j \times D_{j'})\,T^{-1}, \tag{155}$$

which gives the decomposition of the direct product into irreducible
parts. The above expression is more usually written as follows:

$$D_j \times D_{j'} = D_{j+j'} + D_{j+j'-1} + \ldots + D_{|j-j'|}. \tag{156}$$
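The values of $k$ in (152) and the count of dimensions can be reproduced with exact arithmetic. This is a minimal sketch; the function names `clebsch_gordan_series` and `order` are our own and are not notation from the text.

```python
from fractions import Fraction


def clebsch_gordan_series(j, j_prime):
    """List the values k = j+j', j+j'-1, ..., |j-j'| appearing in (152)."""
    j, j_prime = Fraction(j), Fraction(j_prime)
    k = j + j_prime
    values = []
    while k >= abs(j - j_prime):
        values.append(k)
        k -= 1
    return values


def order(k):
    """Order 2k+1 of the matrices of the representation D_k."""
    return int(2 * k + 1)


# Example: j = 1, j' = 1/2 gives k = 3/2, 1/2.
series = clebsch_gordan_series(1, Fraction(1, 2))
total = sum(order(k) for k in series)   # should equal (2j+1)(2j'+1) = 3 * 2
```

Using `Fraction` keeps half-integer values of $j$ exact, so the comparison with $|j - j'|$ is free of rounding.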


We recall that each $D_k$ is defined by a unitary transformation and is
to be written out in full as

$$D_k \left\{ \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix} \right\}.$$

The result obtained may be generalized for any number of factors.
For instance, we can write

$$\begin{aligned}
D_1 \times D_1 \times D_1 &= (D_2 + D_1 + D_0) \times D_1 \\
&= D_3 + D_2 + D_1 + D_2 + D_1 + D_0 + D_1 \\
&= D_3 + 2D_2 + 3D_1 + D_0.
\end{aligned}$$

$D_1$ itself is a third order matrix [68]. The direct product $D_1 \times D_1$ is
a ninth order matrix, and finally $D_1 \times D_1 \times D_1$ is a matrix of order
twenty-seven. The above equation shows that this last matrix is
equivalent, with any choice of unitary transformation, to the
quasi-diagonal matrix

$$[D_3, D_2, D_2, D_1, D_1, D_1, D_0].$$

The order of this last matrix is [68]:

$$(2 \cdot 3 + 1) + 2(2 \cdot 2 + 1) + 3(2 \cdot 1 + 1) + (2 \cdot 0 + 1) = 27.$$
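The repeated use of (156) for several factors is mechanical and can be sketched in a few lines. The helper names `multiply` and `decompose_product` below are hypothetical, introduced only for this illustration.

```python
from collections import Counter
from fractions import Fraction


def multiply(decomposition, j):
    """Multiply a sum of D_k (stored as {k: multiplicity}) by D_j, term by term."""
    result = Counter()
    for k, multiplicity in decomposition.items():
        value = k + j
        while value >= abs(k - j):
            result[value] += multiplicity
            value -= 1
    return result


def decompose_product(factors):
    """Decompose D_{j1} x D_{j2} x ... into irreducible parts with multiplicities."""
    decomposition = Counter({Fraction(factors[0]): 1})
    for j in factors[1:]:
        decomposition = multiply(decomposition, Fraction(j))
    return decomposition


triple = decompose_product([1, 1, 1])
# Convert keys to int for readability; all k are integers in this example.
multiplicities = {int(k): count for k, count in triple.items()}
total_order = sum(count * (2 * k + 1) for k, count in multiplicities.items())
```

For three factors $D_1$ the dictionary `multiplicities` lists how often each $D_k$ occurs, and `total_order` is the order of the quasi-diagonal matrix.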


We now prove the linear independence of the $w_i$, as linear forms of
magnitudes (144). The $w_i$ are the $c_{m''}$ in the previous notation, except
that we have to remember that we can take different values of $k$, or
what amounts to the same thing, different values of $l$, when forming
the $c_{m''}$, so that it would be more correct to write $c^{(l)}_{m''}$. As we have
seen above, each $c^{(l)}_{m''}$ is expressed solely in terms of the $U_m V_{m'}$ for
which $m + m' = m''$. Hence it follows at once that only the $c^{(l)}_{m''}$ with
different $l$ but the same $m''$ can be linearly dependent. On removing
the brackets of the last two factors in (146) and collecting terms
in $x_1^{k+m''} x_2^{k-m''}$, where $k$ is given by (148), we in fact obtain, up to
a constant factor, the $c^{(l)}_{m''}$ in terms of $u_s$ and $v_s$. They are clearly the
products of $(u_1 v_2 - u_2 v_1)^l$ and a polynomial with positive integral
coefficients in $u_1$, $u_2$, $v_1$ and $v_2$. It may readily be seen that these
expressions cannot be linearly dependent with different $l$. Suppose, say,
that we had linear dependence of the type:

$$a_1 c^{(l_1)}_{m''} + a_2 c^{(l_2)}_{m''} + a_3 c^{(l_3)}_{m''} = 0,$$

where $l_1 < l_2 < l_3$ and the $a_i$ are non-zero constants. This relationship
must be satisfied as an identity for any $u_1, u_2, v_1, v_2$. Suppose, for
instance, $u_2 = v_1 = v_2 = 1$. By what has been said about the form
of the $c^{(l)}_{m''}$, we get a relationship of the type

$$a_1 (u_1 - 1)^{l_1} p_1(u_1) + a_2 (u_1 - 1)^{l_2} p_2(u_1) + a_3 (u_1 - 1)^{l_3} p_3(u_1) = 0,$$

where the $p_i(u_1)$ are polynomials in $u_1$ with positive integral coefficients.
On dividing by $(u_1 - 1)^{l_1}$ then setting $u_1 = 1$, the above gives us
$a_1 = 0$, which contradicts what has been said and thus proves the
impossibility of a linear relationship.

Of course we could actually have constructed the expressions for
the $w_i$ in terms of variables (144) by removing the brackets in (146).


76. Orthogonality. Matrices forming non-equivalent unitary irreducible 
representations have the property generally known as orthogonality. They 
are often employed in applications of group theory to physics. We first of all 
formulate this property. 

Let $G$ be a finite group of order $m$ with elements

$$G_1, G_2, \ldots, G_m$$

and let

$$A^{(1)}, \ldots, A^{(m)} \quad \text{and} \quad B^{(1)}, \ldots, B^{(m)}$$

be two systems of matrices giving linear representations of $G$. If we write
small letters with two subscripts for the matrix elements and assume that
the representations are non-equivalent and irreducible and consist of unitary
matrices, we shall find that we have the following equation:

$$\sum_{s=1}^{m} a^{(s)}_{ij}\, \overline{b^{(s)}_{kl}} = 0, \tag{157}$$

this being valid for any values of the subscripts. Similar equations apply for
a single irreducible unitary representation. Let $p$ be the order of the matrices
$A^{(s)}$, yielding an irreducible unitary representation. We have the following
equations:

$$\sum_{s=1}^{m} a^{(s)}_{ij}\, \overline{a^{(s)}_{kl}} = \frac{m}{p}\, \delta_{ik}\, \delta_{jl}, \tag{158}$$

i.e. the sum on the left is zero if the pairs of numbers $(i, j)$ and $(k, l)$ are different,
and is equal to $m/p$ if the pairs are the same.
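Relationships (157) and (158) can be tried out on a small group. The sketch below assumes, as our own choice of example, the group of order 6 formed by the symmetries of an equilateral triangle, taken in its two-dimensional representation by real orthogonal matrices (order $p = 2$) and in the one-dimensional representation by determinants. Since all the numbers are real, the complex conjugation in (157) and (158) has no effect here.

```python
import math
from itertools import product


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][s] * b[s][k] for s in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


cos_t, sin_t = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
E = [[1.0, 0.0], [0.0, 1.0]]
R = [[cos_t, -sin_t], [sin_t, cos_t]]      # rotation through 120 degrees
S = [[1.0, 0.0], [0.0, -1.0]]              # a reflection
R2 = matmul(R, R)

A = [E, R, R2, S, matmul(S, R), matmul(S, R2)]   # two-dimensional representation
B = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]            # determinants of the matrices in A
m, p = len(A), 2

# Left-hand side of (157) for every pair of subscripts (i, j); B has one element.
mixed = {(i, j): sum(A[s][i][j] * B[s] for s in range(m))
         for i, j in product(range(2), repeat=2)}

# Left-hand side of (158) for every choice of subscripts (i, j, k, l).
same = {(i, j, k, l): sum(A[s][i][j] * A[s][k][l] for s in range(m))
        for i, j, k, l in product(range(2), repeat=4)}
```

Every entry of `mixed` vanishes up to rounding, and `same[(i, j, k, l)]` equals $m/p = 3$ when $i = k$, $j = l$ and is zero otherwise.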

The proof of orthogonality is based on Theorem I of [66]. We first recall
the multiplication rule in the case of rectangular (not square) matrices. Let
$C$ and $D$ be matrices with elements

$$\{D\}_{ij} \quad \begin{pmatrix} i = 1, 2, \ldots, n_1 \\ j = 1, 2, \ldots, n_2 \end{pmatrix} \quad \text{and} \quad \{C\}_{ij} \quad \begin{pmatrix} i = 1, 2, \ldots, n_2 \\ j = 1, 2, \ldots, n_3 \end{pmatrix},$$

the number $n_2$ of columns of $D$ being the same as the number of rows of $C$.
The elements of the product $DC$ are defined by the usual expression

$$\{DC\}_{ik} = \sum_{s=1}^{n_2} \{D\}_{is}\, \{C\}_{sk}.$$

The new matrix $DC$ will clearly have $n_1$ rows and $n_3$ columns.
We now state a fundamental theorem.


THEOREM. If unitary matrices $A^{(s)}$ of order $p$ and unitary matrices $B^{(s)}$ of
order $q$ give non-equivalent irreducible representations of a group $G$, and if a
rectangular matrix $C$ with $p$ rows and $q$ columns satisfies for all $s$:

$$A^{(s)} C = C B^{(s)} \qquad (s = 1, 2, \ldots, m), \tag{159}$$

$C$ is a zero matrix, i.e. all its elements are zero.

We first take the case $p = q$, when $C$ is a square matrix. If the determinant
of $C$ differs from zero, there exists $C^{-1}$, and it follows from (159) that

$$A^{(s)} = C B^{(s)} C^{-1},$$

i.e. the two representations are equivalent, which contradicts the hypothesis
of the theorem. The determinant of $C$ must therefore vanish. Suppose that
not all the elements of $C$ are zero and that we write them as $c_{ik}$. We know that
the linear forms

$$c_{i1} x_1 + \ldots + c_{ip} x_p \qquad (i = 1, 2, \ldots, p)$$

define with arbitrary $x_s$ a subspace with a number of dimensions equal to the
rank of $C$ [14], i.e. the subspace here has a number of dimensions $\ge 1$ and $< p$.
In other words, we are concerned here with a subspace $R$ and not the total space
of $p$ dimensions. We write (159) as a linear transformation on a vector with
components $(x_1, \ldots, x_p)$:

$$A^{(s)} C (x_1, \ldots, x_p) = C B^{(s)} (x_1, \ldots, x_p) \qquad (s = 1, 2, \ldots, m).$$

The $C(x_1, \ldots, x_p)$ on the left is an arbitrary vector of $R$, whilst the whole of
the right-hand side, representing a linear transformation $C$ on a vector
$B^{(s)}(x_1, \ldots, x_p)$, also belongs to $R$. In other words, the transformation $A^{(s)}$
on any vector of $R$ again yields a vector of $R$. In this case, as we know from
[66], the $A^{(s)}$ give a reducible representation, which contradicts the hypothesis
of the theorem.

This proof remains in force if $p > q$. The rank of $C$ is now always less than
$p$, and the linear forms

$$c_{i1} x_1 + \ldots + c_{iq} x_q \qquad (i = 1, 2, \ldots, p)$$

define a subspace $R$ with a number of dimensions less than $p$; thus the proof
remains as before. Suppose finally that $p < q$; we pass to the transposed matrices
in (159), which gives us

$$B^{(s)(*)}\, C^{(*)} = C^{(*)}\, A^{(s)(*)}.$$

The order $q$ of $B^{(s)(*)}$ is higher than the order $p$ of $A^{(s)(*)}$, and we conclude
from this as above that the unitary matrices $B^{(s)(*)}$ leave a subspace unchanged,
so that we can reduce them to the quasi-diagonal form by a suitable choice
of fundamental vectors. The matrices $B^{(s)}$ will also become quasi-diagonal,
which contradicts the hypothesis of the theorem. The theorem is thus proved.

We could have omitted the condition in the theorem that $A^{(s)}$ and $B^{(s)}$ are
unitary. As we know, these can always be assumed unitary if we are prepared to
pass to similar representations, in which case we get a new matrix $C_1$ instead
of $C$ in (159), $C_1$ being connected with $C$ by a relationship of the form

$$C = D_1 C_1 D_2;$$

and since $C_1$ is the null matrix, the same can be said of $C$.


We now turn to the proof of (157). We introduce the notations $A(G_s)$ and
$B(G_s)$ instead of $A^{(s)}$ and $B^{(s)}$, where $G_s$ is the element of $G$ to which $A^{(s)}$ and
$B^{(s)}$ correspond. Let $X$ be any matrix with $p$ rows and $q$ columns. We introduce
the matrix

$$C = \sum_{s=1}^{m} A(G_s)\, X\, B(G_s)^{-1} \tag{160}$$

and show that it satisfies (159).
Let $G_t$ be a fixed element of $G$. We have

$$A(G_t)\, C = \sum_{s=1}^{m} A(G_t)\, A(G_s)\, X\, B(G_s)^{-1}.$$

But by the definition of linear representation:

$$A(G_t)\, A(G_s) = A(G_t G_s) \quad \text{and} \quad B(G_t)\, B(G_s) = B(G_t G_s),$$

and hence

$$A(G_t)\, C = \sum_{s=1}^{m} A(G_t G_s)\, X\, B(G_t G_s)^{-1}\, B(G_t).$$

If $G_s$ runs over all elements of the group, the same can be said of the product
$G_t G_s$, so that we can write the equation above as

$$A(G_t)\, C = C\, B(G_t),$$

i.e. the matrix $C$ defined by (160) in fact satisfies (159), and $C$ is consequently
a null matrix. We thus have, for any choice of matrix $X$:

$$\sum_{s=1}^{m} A(G_s)\, X\, B(G_s)^{-1} = 0.$$

Suppose that a fixed element $\{X\}_{jl}$ of $X$ is unity, and the remainder zero.
The last equation now gives us

$$\sum_{s=1}^{m} \{A(G_s)\}_{ij}\, \{B(G_s)^{-1}\}_{lk} = 0.$$

Since the matrices are unitary, $B(G_s)$ is obtained from $B(G_s)^{-1}$ by replacing
rows by columns and all the elements by their conjugates, so that the last
equation becomes in the previous notation:

$$\sum_{s=1}^{m} a^{(s)}_{ij}\, \overline{b^{(s)}_{kl}} = 0,$$

which is the same as (157).
Similarly, by constructing the matrix

$$D = \sum_{s=1}^{m} A(G_s)\, X\, A(G_s)^{-1},$$

where $X$ is any square matrix of order $p$, we can show that

$$A(G_s)\, D = D\, A(G_s) \qquad (s = 1, 2, \ldots, m),$$

and we can say from Theorem III of [66] that $D$ is a scalar matrix, or

$$\sum_{s=1}^{m} A(G_s)\, X\, A(G_s)^{-1} = cI,$$

where the number $c$ depends on the choice of $X$. Again, let $\{X\}_{jl} = 1$ and
the remaining elements of $X$ be zero, and let $c_{jl}$ denote the corresponding value
of the number $c$. We can write

$$\sum_{s=1}^{m} \{A(G_s)\}_{ij}\, \{A(G_s)^{-1}\}_{lk} = c_{jl}\, \delta_{ik}. \tag{161}$$

To find $c_{jl}$, we put $i = k$ and sum over $i$ from 1 to $p$:

$$p\, c_{jl} = \sum_{s=1}^{m} \sum_{i=1}^{p} \{A(G_s)^{-1}\}_{li}\, \{A(G_s)\}_{ij} = \sum_{s=1}^{m} \{I\}_{lj}.$$

If $l = j$, the right-hand side is equal to $m$, whilst it vanishes with $l \neq j$.
Hence $c_{jl} = (m/p)\, \delta_{jl}$, and (161) can therefore be re-written:

$$\sum_{s=1}^{m} \{A(G_s)\}_{ij}\, \{A(G_s)^{-1}\}_{lk} = \frac{m}{p}\, \delta_{ik}\, \delta_{jl}, \tag{162}$$

which is the same as (158) if we take into account the fact that $A(G_s)$ is unitary.
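The averaging construction for $D$ can be illustrated numerically. The sketch below again assumes, as our own example, the two-dimensional representation of the symmetry group of an equilateral triangle ($m = 6$, $p = 2$) by real orthogonal matrices, so that each inverse is simply the transposed matrix. Summing (162) against the elements of $X$ shows that the scalar must be $c = (m/p)$ times the trace of $X$.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][s] * b[s][k] for s in range(len(b)))
             for k in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


cos_t, sin_t = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
E = [[1.0, 0.0], [0.0, 1.0]]
R = [[cos_t, -sin_t], [sin_t, cos_t]]
S = [[1.0, 0.0], [0.0, -1.0]]
R2 = matmul(R, R)
A = [E, R, R2, S, matmul(S, R), matmul(S, R2)]
m, p = len(A), 2

X = [[0.3, -1.2], [2.0, 0.7]]              # an arbitrary square matrix of order p

# D = sum over the group of A(G_s) X A(G_s)^{-1}; the inverse is the transpose here.
D = [[0.0, 0.0], [0.0, 0.0]]
for a in A:
    D = add(D, matmul(matmul(a, X), transpose(a)))

c_value = (m / p) * (X[0][0] + X[1][1])    # expected scalar: (m/p) * trace of X
```

The matrix `D` comes out as `c_value` times the unit matrix, up to rounding, although `X` itself is far from scalar.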

Relationship (157) may easily be seen to hold, not merely for unitary, but
for arbitrary non-equivalent and irreducible group representations. Let $A'(G_s)$
and $B'(G_s)$ be two such representations of degrees $p$ and $q$, whilst $A(G_s)$ and $B(G_s)$
are unitary representations equivalent to them, so that

$$A(G_s) = C_1\, A'(G_s)\, C_1^{-1}; \qquad B(G_s) = C_2\, B'(G_s)\, C_2^{-1},$$

where $C_1$ and $C_2$ are definite matrices not depending on $s$. We have by the
unitariness of $B(G_s)$:

$$B(G_s)^{-1} = B(G_s)^{*} = (C_2^{-1})^{*}\, B'(G_s)^{*}\, C_2^{*},$$

and (157) can be written as

$$\sum_{s=1}^{m} C_1\, A'(G_s)\, C_1^{-1}\, X\, (C_2^{-1})^{*}\, B'(G_s)^{*}\, C_2^{*} = 0,$$

whence, on multiplying on the left by $C_1^{-1}$ and on the right by $(C_2^{*})^{-1}$, and
introducing the arbitrary matrix $Y = C_1^{-1} X (C_2^{-1})^{*}$ with $p$ rows and $q$ columns,
we get

$$\sum_{s=1}^{m} A'(G_s)\, Y\, B'(G_s)^{*} = 0,$$

and therefore, using the arbitrariness of $Y$ as above:

$$\sum_{s=1}^{m} a'^{(s)}_{ij}\, \overline{b'^{(s)}_{kl}} = 0.$$

We notice also that (162) is valid for any representation, unitary or not,
as follows from the proof and the fact that it is not necessary to mention the
unitariness of the $A^{(s)}$ and $B^{(s)}$ in the statement of the previous theorem.


77. Characters. Suppose, as above, that $A(G_s)$, $B(G_s)$ are two non-equivalent
irreducible representations of orders $p$, $q$ of a group $G$ with elements
$G_1, G_2, \ldots, G_m$. We shall write $X(G_s)$, $X'(G_s)$ for the traces of the matrices
of the representations, i.e. the sums of their diagonal elements:

$$X(G_s) = \sum_{i=1}^{p} \{A(G_s)\}_{ii}; \qquad X'(G_s) = \sum_{k=1}^{q} \{B(G_s)\}_{kk}.$$

These numbers are known as the characters of the representations. The
characters of equivalent representations are clearly the same [27]; also, we can
assume that the representations in question are unitary. The orthogonality
formula gives

$$\sum_{s=1}^{m} \{A(G_s)\}_{ii}\, \overline{\{B(G_s)\}_{kk}} = 0,$$

and summation over $i$ and $k$ gives the orthogonality formula for the
characters:

$$\sum_{s=1}^{m} X(G_s)\, \overline{X'(G_s)} = 0. \tag{163}$$

Similarly, (158) gives

$$\sum_{s=1}^{m} \{A(G_s)\}_{ii}\, \overline{\{A(G_s)\}_{kk}} = \frac{m}{p}\, \delta_{ik},$$

and we have from summation over $i$ and $k$:

$$\sum_{s=1}^{m} X(G_s)\, \overline{X(G_s)} = m. \tag{164}$$


We shall prove a number of theorems by using these formulae. 

THEOREM 1. The necessary and sufficient condition for two irreducible
representations to be equivalent is that their characters are the same.

We have already mentioned that the characters of equivalent (reducible or
irreducible) representations are the same, so that the necessity of the condition
is established. We now assume the converse, that the systems of characters
of two irreducible representations are the same, i.e. $X(G_s) = X'(G_s)$
$(s = 1, 2, \ldots, m)$, and prove the equivalence of the representations. We have
by (164):

$$\sum_{s=1}^{m} X(G_s)\, \overline{X'(G_s)} = m,$$

whence the equivalence follows, since otherwise we should have relationship
(163). We notice the obvious point that the matrices in equivalent
representations must be of the same order. Corresponding to each irreducible
representation, we introduce vectors in the complex $m$-dimensional space $R_m$
with components:

$$\frac{1}{\sqrt{m}}\, X(G_1),\; \frac{1}{\sqrt{m}}\, X(G_2),\; \ldots,\; \frac{1}{\sqrt{m}}\, X(G_m).$$

These vectors are normalized by virtue of (164), and vectors corresponding to
non-equivalent representations are mutually orthogonal by (163). Hence it
follows that there cannot exist more than $m$ non-equivalent irreducible
representations of a group $G$ of order $m$. We shall later define more precisely
the total number of non-equivalent irreducible representations of a group. We
shall denote this number by the letter $l$ for the present. Let $\omega^{(i)}$ denote these
non-equivalent irreducible representations $(i = 1, 2, \ldots, l)$ and let their
characters be

$$X^{(i)}(G_1),\; X^{(i)}(G_2),\; \ldots,\; X^{(i)}(G_m) \qquad (i = 1, 2, \ldots, l).$$

Suppose that there exists a representation $\omega$ with characters

$$X(G_1),\; X(G_2),\; \ldots,\; X(G_m).$$

As a result of reduction, $\omega$ is given by quasi-diagonal matrices formed from the
matrices of representations $\omega^{(i)}$. We thus have for the characters:

$$X(G_s) = \sum_{i=1}^{l} a_i\, X^{(i)}(G_s), \tag{165}$$

where the $a_i$ are non-negative integers which show us how many times the
representation $\omega^{(i)}$ appears in the constitution of $\omega$ after its reduction.

Expressions can be derived for the coefficients $a_i$ in terms of the characters
of representation $\omega$. Let $k$ be one of the numbers $1, 2, \ldots, l$. We multiply
both sides of (165) by $\overline{X^{(k)}(G_s)}$ and sum over $s$. We obtain, using (163) and
(164):

$$\sum_{s=1}^{m} X(G_s)\, \overline{X^{(k)}(G_s)} = a_k\, m,$$

whence

$$a_k = \frac{1}{m} \sum_{s=1}^{m} X(G_s)\, \overline{X^{(k)}(G_s)}. \tag{166}$$

This expression yields a definite value for each $a_k$, whence we get the following
theorem.

THEOREM 2. Every reducible representation decomposes into a unique set of
irreducible representations.
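Formula (166) lends itself to direct computation. As a minimal sketch we take the group of all permutations of three objects ($m = 6$) and assume its three well-known irreducible characters: the identity character, the sign of the permutation, and the number of fixed points less one. These characters are real, so the conjugation in (166) is omitted; the names used in the code are our own.

```python
from fractions import Fraction
from itertools import permutations

elements = list(permutations(range(3)))
m = len(elements)
identity = tuple(range(3))


def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if i == image)


def sign(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return -1 if inversions % 2 else 1


# Assumed irreducible characters of the permutation group on three objects.
irreducible = {
    "identity": lambda p: 1,
    "sign": sign,
    "two_dimensional": lambda p: fixed_points(p) - 1,
}


def multiplicities(character):
    """Coefficients a_k of (166) for a representation with the given character."""
    return {name: Fraction(sum(character(g) * chi(g) for g in elements), m)
            for name, chi in irreducible.items()}


# Representation by 3x3 permutation matrices: its character counts fixed points.
permutation_rep = multiplicities(fixed_points)
# Regular representation: character m at the identity element, zero elsewhere.
regular_rep = multiplicities(lambda g: m if g == identity else 0)
```

The coefficients come out as non-negative integers, as (165) requires, and they determine the decomposition uniquely.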

By using (166), we can easily generalize Theorem 1 for the case of any rep- 
resentations, irreducible or not. 

THEOREM 3. The necessary and sufficient condition for two representations to
be equivalent is for their characters to be the same.

The necessity of the condition has been noted in the proof of Theorem 1.
Conversely, if the characters $X(G_s)$ of two representations are the same, we get
like values for the $a_k$ by (166), and both representations consequently reduce
to a quasi-diagonal matrix composed of the same irreducible representations.
We can assume here, on passing to an equivalent representation if necessary,
that the irreducible representations in question are arranged in the
quasi-diagonal matrix in the same order, since a permutation of rows and columns
is equivalent to passage to an equivalent representation.

Representations with the same characters thus reduce to the same quasi- 
diagonal matrix, i.e. they are equivalent. 

We now turn to the investigation of the total number $l$ of irreducible,
non-equivalent representations of the group $G$. The group elements are distributed
into classes. We find in the same class the elements obtained from one of them
$G_s$ with the aid of the expression:

$$G_t\, G_s\, G_t^{-1} \qquad (t = 1, 2, \ldots, m).$$

Similar matrices with the same trace correspond to all these elements in any
representation. Let $r$ be the number of classes in $G$. By what has been said
above, every linear representation of $G$ has not more than $r$ different characters,
where each character corresponds, not to individual elements, but to all the
elements of a given class. Let the class $C_1$ consist of $g_1$ elements, $C_2$ of $g_2$ elements,
and finally, $C_r$ of $g_r$ elements. The terms of sum (163) are the same for elements
of the same class, and on writing $X(C_k)$, $X'(C_k)$ for the characters corresponding
to the elements of class $C_k$, we can re-write (163) for two non-equivalent
irreducible representations as

$$\sum_{k=1}^{r} X(C_k)\, \overline{X'(C_k)}\, g_k = 0,$$

whilst (164) becomes

$$\sum_{k=1}^{r} X(C_k)\, \overline{X(C_k)}\, g_k = m.$$

We thus have, for the characters $X^{(i)}(C_k)$ of non-equivalent irreducible
representations $\omega^{(i)}$ $(i = 1, 2, \ldots, l)$:

$$\begin{aligned}
\sum_{k=1}^{r} X^{(i_1)}(C_k)\, \overline{X^{(i_2)}(C_k)}\, g_k &= 0 \quad \text{for } i_1 \neq i_2, \\
\sum_{k=1}^{r} X^{(i)}(C_k)\, \overline{X^{(i)}(C_k)}\, g_k &= m.
\end{aligned} \tag{167}$$

We introduce into the $r$-dimensional space $R_r$ $l$ vectors with components

$$\sqrt{\frac{g_1}{m}}\, X^{(i)}(C_1),\; \sqrt{\frac{g_2}{m}}\, X^{(i)}(C_2),\; \ldots,\; \sqrt{\frac{g_r}{m}}\, X^{(i)}(C_r) \qquad (i = 1, 2, \ldots, l).$$

The above equations show that these vectors are mutually orthogonal and
normalized, and consequently linearly independent. It follows that the number
$l$ of them is not greater than the number of dimensions, i.e. $l \le r$. This gives
us the theorem:

THEOREM 4. The number of non-equivalent, irreducible representations of a
group is not greater than the number of classes of the group.
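The distribution of a group into classes by means of the expression $G_t G_s G_t^{-1}$ is easily carried out by machine. The following is a sketch under our own conventions: permutations are stored as tuples, and `compose(p, q)` means that `q` is applied first, then `p`.

```python
from itertools import permutations


def compose(p, q):
    """Product of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)


def conjugacy_classes(group):
    """Split the group into classes of elements t g t^{-1}, t running over the group."""
    remaining, classes = set(group), []
    for g in group:
        if g in remaining:
            cls = {compose(compose(t, g), inverse(t)) for t in group}
            classes.append(cls)
            remaining -= cls
    return classes


group = list(permutations(range(3)))        # all permutations of three objects
classes = conjugacy_classes(group)
sizes = sorted(len(c) for c in classes)     # the numbers g_k of elements per class
```

For the permutations of three objects there are three classes, so by Theorem 4 this group has at most three non-equivalent irreducible representations.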


It is shown in the next article that we always have $l = r$. Since we have
just proved that $l \le r$, the equality follows if we can show that $l \ge r$. The
proof of this latter inequality is bound up with the introduction of certain new
concepts and relationships regarding characters which are of interest in
themselves.

We establish a further relationship between the characters of any irreducible
representation. Let the class $C_k$ consist of the elements $G^{(k)}_1, G^{(k)}_2, \ldots, G^{(k)}_{g_k}$.
The expression $G_s\, G^{(k)}_t\, G_s^{-1}$ $(t = 1, 2, \ldots, g_k)$, where $G_s$ is any element of the
group, again gives us all the elements of class $C_k$, though now in a different
order. It follows that, if we take the set of all the products of elements of two
classes $C_p$ and $C_q$:

$$G^{(p)}_u\, G^{(q)}_v \qquad (u = 1, 2, \ldots, g_p;\; v = 1, 2, \ldots, g_q); \tag{168}$$

the expression

$$G_s\, G^{(p)}_u\, G^{(q)}_v\, G_s^{-1} = (G_s\, G^{(p)}_u\, G_s^{-1})\,(G_s\, G^{(q)}_v\, G_s^{-1})$$

gives us the same set of elements. This implies the following property of set
(168): if an element belongs to the set, the whole of the class containing the
element likewise belongs to the set, where each element of the class appears in set
(168) the same number of times. We write $a_{pqk}$ for the non-negative integer
indicating how many times elements of class $C_k$ appear in set (168). All this may be
expressed in a purely symbolic form as

$$C_p\, C_q = \sum_{k=1}^{r} a_{pqk}\, C_k \tag{169}$$

or

$$(G^{(p)}_1 + G^{(p)}_2 + \ldots + G^{(p)}_{g_p})\,(G^{(q)}_1 + G^{(q)}_2 + \ldots + G^{(q)}_{g_q}) = \sum_{k=1}^{r} a_{pqk}\, (G^{(k)}_1 + G^{(k)}_2 + \ldots + G^{(k)}_{g_k}). \tag{170}$$

Let $A(G_s)$ be the $n$th order matrices of an irreducible linear representation of
group $G$. We form the sum of the matrices corresponding to elements of class
$C_k$, and call the new matrix $A(C_k)$:

$$A(C_k) = \sum_{t=1}^{g_k} A(G^{(k)}_t).$$

On observing that the elements $G_s\, G^{(k)}_t\, G_s^{-1}$ with $t = 1, 2, \ldots, g_k$ and any $G_s$
of $G$ give the total set of elements of class $C_k$, it will be seen that the matrix
$A(C_k)$ commutes with all the matrices $A(G_s)$. Hence it follows that $A(C_k)$ is
a scalar matrix [66], so that we can write:

$$A(C_k) = b_k I \qquad (k = 1, 2, \ldots, r), \tag{171}$$

where the $b_k$ are numbers. If we make use of the definition of the numbers
$a_{pqk}$, i.e. of symbolic expression (170), we get the following relationship between
the $b_k$:

$$b_p\, b_q = \sum_{k=1}^{r} a_{pqk}\, b_k. \tag{172}$$

The trace of matrix $A(C_k)$ is equal to the sum of the traces of the $A(G^{(k)}_t)$
$(t = 1, 2, \ldots, g_k)$, i.e. is equal to $g_k X(C_k)$. On the other hand, it follows from
(171) that the trace of $A(C_k)$ is equal to $n b_k$, i.e. $n b_k = g_k X(C_k)$, whence

$$b_k = \frac{g_k}{n}\, X(C_k),$$

and relationship (172) leads to the following theorem.

THEOREM 5. The following relationships hold between the characters of any
irreducible representation formed by n-th order matrices:

$$g_p X(C_p)\, g_q X(C_q) = n \sum_{k=1}^{r} a_{pqk}\, g_k X(C_k). \tag{173}$$

We note that one of the classes $C_k$ is that consisting merely of the identity
element $E$ of the group $G$. Corresponding to this we always have the unit matrix
of a linear representation, the trace of which is equal to its order $n$. This class
will always be denoted by $C_1$, so that $X(C_1) = n$, and the above expression can
be re-written:

$$g_p X(C_p)\, g_q X(C_q) = X(C_1) \sum_{k=1}^{r} a_{pqk}\, g_k X(C_k). \tag{174}$$

We now find the values of constants $a_{pq1}$. There corresponds to each class $C_p$
a class $C_{p'}$, consisting of the inverse elements to those of $C_p$. This follows at
once from the definition of class and the fact that the equation $G_s G_u G_s^{-1} = G_v$
leads to $G_s G_u^{-1} G_s^{-1} = G_v^{-1}$.

The class $C_{p'}$ can coincide with $C_p$, i.e. it may happen that $p = p'$. In every
case, $C_p$ and $C_{p'}$ contain the same number of elements, i.e. $g_p = g_{p'}$. If we take
$q = p'$ in (173) or (174), class $C_1$ will appear $g_p$ times on the right-hand side,
whilst with $q \neq p'$, the right-hand side does not contain $C_1$, i.e.

$$a_{pq1} = \begin{cases} 0 & \text{for } q \neq p', \\ g_p & \text{for } q = p'. \end{cases} \tag{175}$$
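The constants $a_{pqk}$ of (169) can be found by forming all the products (168) and counting. A minimal sketch for the group of permutations of three objects follows; the function names are our own, and the class of the identity element is stored first (index 0 in the code corresponds to $C_1$ in the text).

```python
from collections import Counter
from itertools import permutations


def compose(p, q):
    """Product of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)


def conjugacy_classes(group):
    remaining, classes = set(group), []
    for g in group:
        if g in remaining:
            cls = {compose(compose(t, g), inverse(t)) for t in group}
            classes.append(cls)
            remaining -= cls
    return classes


def class_constants(classes):
    """Return {(p, q, k): a_pqk}, counting occurrences of class k in set (168)."""
    constants = {}
    for p, class_p in enumerate(classes):
        for q, class_q in enumerate(classes):
            counts = Counter(compose(x, y) for x in class_p for y in class_q)
            for k, class_k in enumerate(classes):
                values = {counts[z] for z in class_k}
                if len(values) != 1:
                    raise ValueError("elements of one class appear unequally often")
                constants[p, q, k] = values.pop()
    return constants


group = list(permutations(range(3)))     # the identity permutation comes first
classes = conjugacy_classes(group)       # hence classes[0] is the identity class
a = class_constants(classes)
```

The check inside `class_constants` confirms that every element of a class appears the same number of times in set (168), and the entries `a[p, q, 0]` reproduce (175).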


78. Regular representations of groups. We have already mentioned a method
of representing a finite group with the aid of a permutation group. Any
permutation group can be expressed in the form of a transformation group.

For suppose we have the permutation

$$\begin{pmatrix} 1, & 2, & 3, & 4 \\ 2, & 4, & 3, & 1 \end{pmatrix};$$

this can be written as the linear transformation by which $x_1$ becomes $y_2$, $x_2$
becomes $y_4$, $x_3$ becomes $y_3$, and $x_4$ becomes $y_1$:

$$\begin{aligned}
y_1 &= 0x_1 + 0x_2 + 0x_3 + x_4, \\
y_2 &= x_1 + 0x_2 + 0x_3 + 0x_4, \\
y_3 &= 0x_1 + 0x_2 + x_3 + 0x_4, \\
y_4 &= 0x_1 + x_2 + 0x_3 + 0x_4.
\end{aligned}$$

282 THE BASIO THEORY OF GROUPS AND LINEAR REPRESENTATIONS OF GROUPS [78 


We consider the following representation of a group $G$ by a permutation group.
We multiply the elements $G_1, G_2, \ldots, G_m$ on the right by an element $G_s$. This
leads to a permutation of the elements, i.e. by what has been said above, to a
matrix $P_s$, which is regarded as corresponding to the element $G_s$. This is generally
known as a regular representation of the group $G$. One of the $G_i$ is the identity
element of the group, which we denote as usual by $E$. The unit matrix
corresponds to this, and its trace is therefore $m$, i.e. $X(E) = m$. On multiplication
of elements $G_1, G_2, \ldots, G_m$ by some element $G_s$ other than $E$, no element $G_i$ remains
in place, i.e. all the diagonal elements are zero in the corresponding matrix,
and in a regular representation $X(G_s) = 0$ for $G_s \neq E$.
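The statement about the traces can be checked on a small case. The sketch below takes, purely as an assumed example, the group of the six permutations of three objects, forms the matrix of the permutation $G_i \to G_i G_s$ for each $G_s$, and records its trace; the helper names are ours.

```python
from itertools import permutations


def compose(p, q):
    """Product of two permutations given as tuples: q is applied first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def regular_matrix(elements, g):
    """0/1 matrix of the permutation G_i -> G_i g of the listed group elements."""
    index = {element: i for i, element in enumerate(elements)}
    m = len(elements)
    matrix = [[0] * m for _ in range(m)]
    for i, element in enumerate(elements):
        matrix[index[compose(element, g)]][i] = 1
    return matrix


def trace(matrix):
    """Sum of the diagonal elements."""
    return sum(matrix[i][i] for i in range(len(matrix)))


elements = list(permutations(range(3)))  # the six elements of the group
identity = tuple(range(3))
characters = {g: trace(regular_matrix(elements, g)) for g in elements}
```

Only the traces are examined here; the sketch makes no claim about the order in which products of the matrices correspond to products of group elements.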

Suppose that, on reduction, a regular representation contains the
representation $\omega^{(i)}$, that we have discussed above, $h_i$ times. We have with this, by
what has already been said:

$$\sum_{i=1}^{l} h_i X^{(i)}(G_s) = \begin{cases} 0 & \text{for } G_s \neq E, \\ m & \text{for } G_s = E. \end{cases} \tag{176}$$


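On multiplying both sides of this equation by $\overline{X^{(k)}(G_s)}$ and summing over $s$,
we get by (163) and (164):

$$h_k m = m\, \overline{X^{(k)}(E)};$$

but if we write $n_k$ for the order of the matrices in representation $\omega^{(k)}$, we have
$X^{(k)}(E) = n_k$, whilst from above, $\overline{X^{(k)}(E)} = X^{(k)}(E) = h_k$, whence $n_k = h_k$,
and (176) can be written as

$$\sum_{i=1}^{l} X^{(i)}(E)\, X^{(i)}(G_s) = \sum_{i=1}^{l} n_i X^{(i)}(G_s) = \begin{cases} 0 & \text{for } G_s \neq E, \\ m & \text{for } G_s = E. \end{cases} \tag{177}$$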


We thus arrive at the following theorem.
THEOREM 6. A regular representation contains each irreducible representation
$\omega^{(k)}$ a number of times equal to the order $n_k$ of the matrices in the $\omega^{(k)}$, the characters
of the $\omega^{(k)}$ being related by (177).
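We now write down (174) for representation $\omega^{(i)}$:

$$g_p X^{(i)}(C_p)\, g_q X^{(i)}(C_q) = X^{(i)}(C_1) \sum_{k=1}^{r} a_{pqk}\, g_k X^{(i)}(C_k),$$

and we sum over $i$ from 1 to $l$:

$$g_p g_q \sum_{i=1}^{l} X^{(i)}(C_p) X^{(i)}(C_q) = \sum_{k=1}^{r} a_{pqk}\, g_k \sum_{i=1}^{l} X^{(i)}(C_1) X^{(i)}(C_k).$$

We obtain, on taking (177) into account:

$$g_p g_q \sum_{i=1}^{l} X^{(i)}(C_p) X^{(i)}(C_q) = a_{pq1}\, m,$$

i.e. by (175):

$$\sum_{i=1}^{l} X^{(i)}(C_p) X^{(i)}(C_q) = \begin{cases} 0 & \text{for } q \neq p', \\ \dfrac{m}{g_p} & \text{for } q = p'. \end{cases} \tag{178}$$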
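Relationship (178) can be verified on a familiar character table. The following sketch uses, as an assumed example that is not taken from the text, the group of permutations of three objects: its classes (identity, transpositions, cyclic permutations) contain 1, 3 and 2 elements, every class coincides with its class of inverses, and its three irreducible characters are entered by hand.

```python
from fractions import Fraction

# Assumed example: the permutation group on three objects.
class_sizes = [1, 3, 2]          # g_p for: identity, transpositions, 3-cycles
inverse_class = [0, 1, 2]        # p' = p for every class of this group
characters = [
    [1, 1, 1],                   # identity representation
    [1, -1, 1],                  # alternating (sign) representation
    [2, 0, -1],                  # second order representation
]
order = sum(class_sizes)         # m = 6


def column_sum(p, q):
    """Left-hand side of (178): sum over representations of X(C_p) X(C_q)."""
    return sum(row[p] * row[q] for row in characters)


def expected(p, q):
    """Right-hand side of (178): m / g_p when q = p', zero otherwise."""
    return Fraction(order, class_sizes[p]) if q == inverse_class[p] else Fraction(0)


checks = [column_sum(p, q) == expected(p, q) for p in range(3) for q in range(3)]
```

All nine entries of `checks` come out true for this table.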




We form the set of $l$ homogeneous linear equations in $x_1, x_2, \ldots, x_r$:

$$\sum_{q=1}^{r} x_q X^{(k)}(C_q) = 0 \qquad (k = 1, 2, \ldots, l) \tag{179}$$

and show that it only has the zero solution.

For, on multiplying both sides of (179) by $X^{(k)}(C_p)$ and summing over $k$,
we get by (178) $x_{p'} = 0$, where $p'$ is any of the numbers $1, 2, \ldots, r$. Since system (179)
has only the zero solution, the number of equations is not less than the number
of unknowns, i.e. $l \geq r$. We showed earlier that $l \leq r$, whence it follows that
$l = r$, i.e.

THEOREM 7. The total number of non-equivalent irreducible representations of
a finite group $G$ is equal to the number of classes of $G$.

We further notice a consequence of Theorem 6. The regular representation
of group $G$ consists of matrices of order $m$. On the other hand, by Theorem 6,
it contains each representation $\omega^{(k)}$, consisting of matrices of order $n_k$,
$n_k$ times.

This gives us the equation:

$$\sum_{k=1}^{r} n_k^2 = m, \tag{180}$$

which may be stated in words as:
THEOREM 8. The sum of the squares of the orders of the non-equivalent irreducible
representations $\omega^{(k)}$ is equal to the order of the group $G$.


79. Examples of representations of finite groups. 1. We take the Abelian
group $G$ consisting of elements $A_1^i A_2^k$, where $i = 0, 1, 2, \ldots, m-1$;
$k = 0, 1, 2, \ldots, n-1$, the $A_1$ and $A_2$ elements commute, $A_1^m = E$, $A_2^n = E$,
and with $i = k = 0$ we have to take $A_1^0 A_2^0 = E$. Each individual element of
$G$ forms a class, and all the irreducible representations of the group are of the
first degree. Let $\alpha$ and $\beta$ be values of the $m$th and $n$th roots of unity. We associate
with the element $A_1^i A_2^k$ the number $\alpha^i \beta^k$, and thus obtain a group representation,
as is easily seen. Assigning to $\alpha$ and $\beta$ all possible values of the above-mentioned
roots, we get altogether $mn$ different first degree representations. The total
number of classes, i.e. elements, is also equal to $mn$, and all the non-equivalent
irreducible representations are thus obtained. The construction of the
representations is similar in the case when the number of factor "elements" (i.e. of $A_s$)
is more than two.
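A short numerical sketch of this construction follows. The choice $m = 3$, $n = 4$ in the usage note, the labels `u`, `v` for the particular roots of unity, and the function names are all ours; the check simply confirms that the product of the numbers associated with two elements is the number associated with their product.

```python
import cmath


def character(m, n, u, v):
    """First degree representation A_1^i A_2^k -> alpha^i beta^k.

    alpha = exp(2 pi i u / m) and beta = exp(2 pi i v / n) are one choice
    of m-th and n-th roots of unity, labelled by the integers u and v.
    """
    alpha = cmath.exp(2j * cmath.pi * u / m)
    beta = cmath.exp(2j * cmath.pi * v / n)
    return lambda i, k: alpha ** i * beta ** k


def is_representation(chi, m, n, tol=1e-9):
    """Check chi(g h) = chi(g) chi(h), the exponents being reduced mod m and n."""
    for i1 in range(m):
        for k1 in range(n):
            for i2 in range(m):
                for k2 in range(n):
                    product = chi((i1 + i2) % m, (k1 + k2) % n)
                    if abs(product - chi(i1, k1) * chi(i2, k2)) > tol:
                        return False
    return True


# All mn = 12 representations for the assumed case m = 3, n = 4.
representations = [character(3, 4, u, v) for u in range(3) for v in range(4)]
```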

2. We turn to the $n$th order dihedral group. It consists of the $2n$ elements:

$$E, \quad A^k, \quad T, \quad TA^k \qquad (k = 1, 2, \ldots, n-1),$$

where

$$A^n = E; \quad T^2 = E; \quad TAT^{-1} = A^{-1} \quad (T^{-1} = T). \tag{181}$$

The last of the relationships written is immediately obvious from the geometrical
meaning of rotations $A$ and $T$. An immediate consequence of this is the
relationship $TA^kT^{-1} = A^{-k}$. First let $n = 2m + 1$ be odd. The group will now
consist of $(m + 2)$ classes. One of these contains $E$; $m$ consist of the two elements
$A^k$ and $A^{-k}$ $(k = 1, 2, \ldots, m)$; and one contains all the elements of the type
$T$ and $TA^k$. All this may easily be verified with the aid of the above
relationships.

There exist two first degree representations; in one, the number 1 is associated
with each element; in the other, 1 is associated with element $A$ and $(-1)$ with
$T$. Now let $\varepsilon = \cos 2\pi/n + i \sin 2\pi/n$. We can form $m$ second order
representations by associating with the elements $A$ and $T$ the matrices:

$$A \to \begin{Vmatrix} \varepsilon^{s} & 0 \\ 0 & \varepsilon^{-s} \end{Vmatrix}; \quad T \to \begin{Vmatrix} 0 & 1 \\ 1 & 0 \end{Vmatrix} \qquad (s = 1, 2, \ldots, m). \tag{182}$$

These matrices satisfy relationships (181) and in fact yield the group
representation, since every relationship between elements $A$ and $T$ is a consequence of
(181). The irreducibility of each of the representations follows from the fact
that, otherwise, a representation would reduce to two first order representations,
and the matrices of the representation would have to commute, which is not
in fact the case for any $s$, as may easily be verified.

The non-equivalence of representations (182) for different $s$ follows from the
fact that the matrices corresponding to element $A$ have different sets of
characteristic roots $\varepsilon^{s}$ and $\varepsilon^{-s}$ for different $s$. All $(m + 2)$ non-equivalent, irreducible
representations have thus been obtained. Equation (180) amounts in the
present example to

$$2 \cdot 1^2 + m \cdot 2^2 = 4m + 2 = 2n.$$
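The verifications left to the reader above are easily carried out numerically. The sketch below builds matrices (182), tests relationships (181) in the form $A^n = E$, $T^2 = E$, $TAT = A^{-1}$, and tests whether $A$ and $T$ commute; the function names and the tolerance are our own choices.

```python
import cmath


def matmul(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matpow(x, power):
    """Non-negative integer power of a 2 x 2 matrix."""
    result = [[1, 0], [0, 1]]
    for _ in range(power):
        result = matmul(result, x)
    return result


def close(x, y, tol=1e-9):
    """True when two 2 x 2 matrices agree to within tol."""
    return all(abs(x[i][j] - y[i][j]) <= tol for i in range(2) for j in range(2))


def dihedral_matrices(n, s):
    """Matrices (182) associated with A and T."""
    eps = cmath.exp(2j * cmath.pi / n)
    return [[eps ** s, 0], [0, eps ** (-s)]], [[0, 1], [1, 0]]


def satisfies_relations(n, s):
    """Check A^n = E, T^2 = E and T A T = A^(-1), using A^(-1) = A^(n-1)."""
    A, T = dihedral_matrices(n, s)
    E = [[1, 0], [0, 1]]
    return (close(matpow(A, n), E) and close(matmul(T, T), E)
            and close(matmul(matmul(T, A), T), matpow(A, n - 1)))


def commutes(n, s):
    """True when the matrices of A and T commute."""
    A, T = dihedral_matrices(n, s)
    return close(matmul(A, T), matmul(T, A))
```

For an odd $n$ such as 7 one may call `satisfies_relations(7, s)` and `commutes(7, s)` for $s = 1, 2, 3$.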


With even $n = 2m$, representation (182) corresponding to the value $s = m$
has the form

$$A \to \begin{Vmatrix} -1 & 0 \\ 0 & -1 \end{Vmatrix}; \quad T \to \begin{Vmatrix} 0 & 1 \\ 1 & 0 \end{Vmatrix}$$

and splits up into the two first degree representations

$$A \to (-1);\; T \to (+1) \quad \text{and} \quad A \to (-1);\; T \to (-1).$$

To obtain this reduction, it is sufficient to utilize a matrix $S$ such that $STS^{-1}$
reduces to the diagonal form, the characteristic roots of $T$ being clearly equal
to $\pm 1$. Thus with $n = 2m$, there are four first degree representations and
$(m - 1)$ second degree. Equation (180) becomes

$$4 \cdot 1^2 + (m - 1) \cdot 2^2 = 4m = 2n.$$


3. We consider the representations of the tetrahedral group or, what amounts
to the same thing, the alternating group isomorphic to it with $n = 4$ [59].
The group consists of four classes and its order is twelve. It must have four
non-equivalent irreducible representations. The degrees of these representations
must satisfy the equation

$$n_1^2 + n_2^2 + n_3^2 + n_4^2 = 12.$$

This equation has a unique positive integral solution, discounting the order
of the terms on the left-hand side:

$$n_1 = n_2 = n_3 = 1; \quad n_4 = 3,$$

i.e. the group has three representations of the first degree and one of the third.
In the first degree representations, the same number corresponds to elements
of the same class, and the correspondences may easily be seen to be as follows
for these three representations:

$$\begin{array}{llll}
\mathrm{I} \to 1; & \mathrm{II} \to 1; & \mathrm{III} \to 1; & \mathrm{IV} \to 1 \\
\mathrm{I} \to 1; & \mathrm{II} \to 1; & \mathrm{III} \to \varepsilon; & \mathrm{IV} \to \varepsilon^2 \\
\mathrm{I} \to 1; & \mathrm{II} \to 1; & \mathrm{III} \to \varepsilon^2; & \mathrm{IV} \to \varepsilon,
\end{array}$$

where

$$\varepsilon = \cos \frac{2\pi}{3} + i \sin \frac{2\pi}{3}.$$
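The uniqueness of the solution of the equation for the degrees is a finite check, which may be sketched as follows. The function name `degree_solutions` is ours; the search simply runs through all non-decreasing sets of positive integers.

```python
from itertools import combinations_with_replacement
from math import isqrt


def degree_solutions(order, count):
    """Non-decreasing tuples of `count` positive integers whose squares sum to `order`."""
    limit = isqrt(order)
    return [degrees
            for degrees in combinations_with_replacement(range(1, limit + 1), count)
            if sum(n * n for n in degrees) == order]


solutions = degree_solutions(12, 4)
```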


The third order irreducible representation gives the tetrahedral group itself, 
i.e. the group of rotations of space (third order matrices) for which the tetra- 
hedron is displaced into itself. If this representation were reducible, it would 
have to reduce to three first degree representations, which is impossible since the 
group is not Abelian. The last sections have been concerned with the theory for 
finite groups. An extension to rotation groups requires a more detailed treat- 
ment of infinite groups that depend on parameters. Before we pass to the general 
treatment of these latter groups, we consider the problem of linear representa- 
tions of the Lorentz group. These representations, together with those of the 
rotation group, will serve us as fundamental examples of infinite parametrically 
dependent groups. 


80. Representations of a linear group in two variables. We constructed
in [68] linear representations of a unitary group in two
variables, which led us to linear representations of rotation groups.
Representations can similarly be constructed of a linear group in two
variables with unity determinant:

$$x_1' = a x_1 + b x_2; \quad x_2' = c x_1 + d x_2 \qquad (ad - bc = 1). \tag{183}$$

This leads us, by what was said in [64], to one- and two-valued
representations of the group of positive Lorentz transformations. We arrive
at results entirely different from those of [68].

One possible linear representation of unitary group (93) is the repre- 
sentation by the group itself, i.e. the linear representation in which, 
corresponding to a given transformation, we have the same transfor- 
mation. Another linear representation is easily seen to be the following: 
to each transformation (93) there corresponds the transformation with 
complex conjugate coefficients: 




But this representation is equivalent to the previous one, as follows 
directly from the fairly obvious equation: 


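$$\begin{Vmatrix} 0 & 1 \\ -1 & 0 \end{Vmatrix} \begin{Vmatrix} a & b \\ -\bar{b} & \bar{a} \end{Vmatrix} = \begin{Vmatrix} \bar{a} & \bar{b} \\ -b & a \end{Vmatrix} \begin{Vmatrix} 0 & 1 \\ -1 & 0 \end{Vmatrix}.$$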
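The matrix equation just written can be confirmed by direct multiplication. In the sketch below the names `J`, `unitary_matrix` and `intertwines` are our own, and the sample values of $a$ and $b$ in the usage note are arbitrary.

```python
def matmul(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def conjugate(matrix):
    """Matrix with complex conjugate elements."""
    return [[complex(z).conjugate() for z in row] for row in matrix]


J = [[0, 1], [-1, 0]]


def unitary_matrix(a, b):
    """Matrix with rows (a, b) and (-conj(b), conj(a))."""
    a, b = complex(a), complex(b)
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def intertwines(a, b, tol=1e-12):
    """Check that J U equals conj(U) J for the matrix U built from a and b."""
    U = unitary_matrix(a, b)
    left = matmul(J, U)
    right = matmul(conjugate(U), J)
    return all(abs(left[i][k] - right[i][k]) <= tol
               for i in range(2) for k in range(2))
```

For instance `intertwines(0.6, 0.8j)` tests a matrix with $|a|^2 + |b|^2 = 1$.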
The conjugate representation for group (183):

$$y_1' = \bar{a} y_1 + \bar{b} y_2; \quad y_2' = \bar{c} y_1 + \bar{d} y_2 \tag{184}$$

is not equivalent to group (183) itself. To see this, we need only consider
the case $b = c = 0$. The matrix of transformation (183) now has
characteristic roots $a$ and $d$, whilst the matrix of (184) has roots $\bar{a}$
and $\bar{d}$. We can clearly choose complex numbers $a$ and $d$ satisfying the
condition $ad = 1$ such that the set $\bar{a}$ and $\bar{d}$ is different from the set
$a$ and $d$, so that the corresponding transformations cannot be similar.
We have thus already obtained two non-equivalent second degree
representations: the group itself (183) and group (184). We discuss
below the irreducibility of the representations.

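We can moreover construct representations of group (183) precisely
as we did in [68]. We only need to replace $\bar{a}$ by $d$ and $\bar{b}$ by $(-c)$. This
gives us the following representation of order $(2j + 1)$, where $j$ is an
integer or half integer:

$$D_j\{a, b; c, d\}_{ls} = \sum_k \frac{\sqrt{(j+l)!\,(j-l)!\,(j+s)!\,(j-s)!}}{k!\,(j-s-k)!\,(j+l-k)!\,(k+s-l)!}\; a^{j+l-k}\, b^{k}\, c^{k+s-l}\, d^{j-s-k} \qquad \left(j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots\right). \tag{185}$$

Here $l$ and $s$ run over the following values:

$$l \text{ and } s = -j, -j+1, \ldots, j-1, j,$$

and the summation over $k$ is defined by the inequalities:

$$k \geq 0; \quad k \geq l - s; \quad k \leq j - s; \quad k \leq j + l.$$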


We have to take $0! = 1$ and $0^0 = 1$ in (185). The identity representation
by unity is obtained with $j = 0$. We can at once write further
representations in addition to (185) by replacing the numbers $a, b, c, d$
by their conjugates on the right-hand side of (185). We shall denote
the corresponding representations as follows:

$$D_j\{\bar{a}, \bar{b}; \bar{c}, \bar{d}\} \qquad \left(j = 0, \tfrac{1}{2}, 1, \ldots\right). \tag{186}$$
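Formula (185) lends itself to a direct numerical check that products of matrices (183) correspond to products of the matrices of the representation. The sketch below is an illustration under our own conventions: the integer `two_j` stands for $2j$, the row and column indices are $j + l$ and $j + s$ (so that half-integral values are avoided), and the function names are hypothetical.

```python
from math import factorial, sqrt


def rep_matrix(two_j, a, b, c, d):
    """Matrix of (185); rows are numbered by p = j + l, columns by q = j + s."""
    size = two_j + 1
    D = [[0j] * size for _ in range(size)]
    for p in range(size):
        for q in range(size):
            norm = sqrt(factorial(p) * factorial(two_j - p)
                        * factorial(q) * factorial(two_j - q))
            total = 0j
            # k >= 0, k >= l - s, k <= j - s, k <= j + l
            for k in range(max(0, p - q), min(two_j - q, p) + 1):
                denominator = (factorial(k) * factorial(two_j - q - k)
                               * factorial(p - k) * factorial(k + q - p))
                total += (norm / denominator * a ** (p - k) * b ** k
                          * c ** (k + q - p) * d ** (two_j - q - k))
            D[p][q] = total
    return D


def matmul(x, y):
    """Product of two square matrices of equal order."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def homomorphism_defect(two_j, m1, m2):
    """Largest modulus among the elements of D(M1 M2) - D(M1) D(M2)."""
    (a1, b1, c1, d1), (a2, b2, c2, d2) = m1, m2
    product = (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
               c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)
    left = rep_matrix(two_j, *product)
    right = matmul(rep_matrix(two_j, *m1), rep_matrix(two_j, *m2))
    return max(abs(left[i][j] - right[i][j])
               for i in range(two_j + 1) for j in range(two_j + 1))
```

Python evaluates `0 ** 0` as 1, in agreement with the convention $0^0 = 1$ adopted above.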


We can now form a composition of representations (185) and (186)
[73], as a result of which a new representation of order $(2j + 1)(2j' + 1)$
is obtained. We denote this as follows:

$$E_{j,j'}\{a, b; c, d\}. \tag{187}$$

By using (185), we can easily write down the elements of the matrices
corresponding to this representation. We take two different
representations (187), though of the same order:

$$E_{p,q}\{a, b; c, d\} \text{ and } E_{p_1,q_1}\{a, b; c, d\} \qquad [(2p + 1)(2q + 1) = (2p_1 + 1)(2q_1 + 1)].$$


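We show that these two representations are not equivalent. We put
$b = c = 0$. Matrices (185) now reduce to the diagonal form with
diagonal elements

$$D_j\{a, 0; 0, d\}_{ll} = a^{j+l} d^{j-l} \qquad (l = -j, -j+1, \ldots, j-1, j).$$

The direct product of two diagonal matrices is also diagonal, and
matrices $E_{p,q}$ and $E_{p_1,q_1}$ consequently have the following
characteristic roots for $b = c = 0$:

$$\begin{aligned}
E_{p,q}&:\; a^{p+l} d^{p-l}\, \bar{a}^{q+m} \bar{d}^{q-m} && (l = -p, -p+1, \ldots, p-1, p;\; m = -q, -q+1, \ldots, q-1, q), \\
E_{p_1,q_1}&:\; a^{p_1+l_1} d^{p_1-l_1}\, \bar{a}^{q_1+m_1} \bar{d}^{q_1-m_1} && (l_1 = -p_1, -p_1+1, \ldots, p_1-1, p_1;\; m_1 = -q_1, -q_1+1, \ldots, q_1-1, q_1),
\end{aligned}$$

or, on observing that $ad = 1$:

$$E_{p,q}:\; a^{2l}\, \bar{a}^{2m}; \qquad E_{p_1,q_1}:\; a^{2l_1}\, \bar{a}^{2m_1}.$$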


We can take any non-zero complex number for $a$, and it can clearly be
chosen so that the set of characteristic roots of the $E_{p,q}$ differs from
the set of roots of the $E_{p_1,q_1}$, which proves the non-equivalence of
representations (187) for different choices of $j$ and $j'$. We observe
that, with $j' = 0$, representation (187) is the same as representation
(185), whilst with $j = 0$ it is the same as the representation obtained from
(185) with $j = j'$ and $a, b, c, d$ replaced by their conjugates. A singular
feature of representations (187) may be noted. They are not equivalent
to unitary representations. If they were, all the characteristic roots of
any representation matrix would have to have unit moduli, whereas
we saw above that the characteristic roots for representations $E_{p,q}$
with $b = c = 0$ are equal to $a^{2l}\,\bar{a}^{2m}$ and can evidently have moduli
differing from unity. The only exception is $E_{0,0}$, which is the trivial
identity representation in which unity corresponds to each element
of group (183).
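The comparison of the sets of characteristic roots can be illustrated numerically. In the sketch below `two_p` and `two_q` stand for $2p$ and $2q$, the values of $a$ used in the usage note are arbitrary choices of ours, and the function names are hypothetical.

```python
def characteristic_roots(two_p, two_q, a):
    """Roots a^(2l) conj(a)^(2m) of E_{p,q} for b = c = 0 and d = 1/a."""
    a = complex(a)
    return [a ** two_l * a.conjugate() ** two_m
            for two_l in range(-two_p, two_p + 1, 2)
            for two_m in range(-two_q, two_q + 1, 2)]


def same_multiset(first, second, tol=1e-9):
    """True when the two lists of roots agree, multiplicities included."""
    remaining = list(second)
    for z in first:
        match = next((w for w in remaining if abs(z - w) <= tol), None)
        if match is None:
            return False
        remaining.remove(match)
    return not remaining


# Two representations of order 2 and two of order 4, compared at sample values of a.
order_two_differ = not same_multiset(characteristic_roots(1, 0, 2j),
                                     characteristic_roots(0, 1, 2j))
order_four_differ = not same_multiset(characteristic_roots(3, 0, 2),
                                      characteristic_roots(1, 1, 2))
largest_modulus = max(abs(z) for z in characteristic_roots(1, 0, 2j))
```

With $a = 2i$ the roots of $E_{1/2,0}$ have moduli 2 and 1/2, so that they do not all lie on the unit circle.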

We saw in [66] that, if a given representation, not necessarily
equivalent to a unitary representation, is reducible, i.e. equivalent to a
representation with quasi-diagonal matrices of the same structure, a
matrix must exist which commutes with all the matrices of the
representation and which is not a scalar matrix. Hence, to prove the
irreducibility of any representation (187), we only need to show that
any matrix commuting with all the matrices of (187) must be scalar.
This can be done precisely as in [68]. Representations (187) are thus
mutually non-equivalent, and each is irreducible. Use is often made
of a definition of reducibility different from that of [58]: a
representation is said to be reducible if all its linear transformations (say of
order $n$) leave unchanged a subspace $L_k$, where $0 < k < n$.

We have seen [58] that if a representation reducible in this sense
consists of unitary matrices, it is reducible in the sense of the definition
of [65], i.e. it is equivalent to a quasi-diagonal representation. If a
representation is not unitary, reducibility in the sense of the definition
of [65] does not follow from the invariance of a certain subspace.
It can be shown that every group representation (187) is not only
irreducible in the sense that we have indicated, but it leaves no
subspace unchanged. It can further be shown that every linear
representation of group (183) is either equivalent to one of representations
(187), or equivalent to a representation having a reduced form
and consisting of several of representations (187).

We saw in [73] that the composition of two linear group
representations is equivalent to multiplying the objects of the representations.
We can say in view of this that the objects of representations (187)
are the expressions

$$\frac{x_1^{j+k}\, x_2^{j-k}}{\sqrt{(j+k)!\,(j-k)!}} \cdot \frac{y_1^{j'+k'}\, y_2^{j'-k'}}{\sqrt{(j'+k')!\,(j'-k')!}} \qquad \begin{pmatrix} k = j, j-1, \ldots, -j+1, -j \\ k' = j', j'-1, \ldots, -j'+1, -j' \end{pmatrix},$$

where $x_1$ and $x_2$ undergo transformation (183) and $y_1$, $y_2$ undergo (184).

We have spoken so far of linear representations of the group
consisting of positive Lorentz transformations [64]. The positive
transformations form only a part of the Lorentz transformations with
unity determinant; in addition, there exist the Lorentz transformations
with $(-1)$ determinant. The study of the structure of these more
general sets of transformations and the extension of linear
representations of the positive Lorentz transformation group to the total
Lorentz group presents certain special features by comparison with the
group of orthogonal transformations in three-dimensional space. It
must be noted that we can lay down the requirement, when defining
the total Lorentz group, that the direction of measuring time remains
unchanged. In this case, we must add reflection to the Lorentz group:


$$x_1' = -x_1; \quad x_2' = -x_2; \quad x_3' = -x_3; \quad x_4' = x_4.$$


A discussion of all the points mentioned can be found, for instance, in
Cartan's Leçons sur la théorie des spineurs (Hermann, Paris, 1938)
and in Van der Waerden's Die gruppentheoretische Methode in der
Quantenmechanik (Springer, Berlin, 1932).


81. Theorem on the simplicity of the Lorentz group. We now show,
by using a method similar to that employed in [70], that the Lorentz
group is simple. For this, we only need to prove that there are no
normal subgroups for the group $G$ of transformations (183) other than
the subgroup consisting of matrices $E$ and $(-E)$. Suppose there is such
a subgroup $H_1$, containing the matrix

$$A = \begin{Vmatrix} a & b \\ c & d \end{Vmatrix} \qquad (ad - bc = 1),$$

differing from $E$ and $(-E)$. We want to show that $H_1$ is the same as $G$.
If $H_1$ contains a matrix $B$, it also contains all the matrices $U^{-1} B U$,
where $U$ is any matrix of $G$. On taking into account the basic result
concerning the reduction of matrices to the canonical form, as also
the fact that the determinant of the matrix $U$, reducing any given
matrix to the canonical form, can always be taken equal to unity [27],
we see that it is sufficient to show that $H_1$ contains in the first place
matrices with any permissible different characteristic roots $t$ and $t^{-1}$,
where $t$ is any complex number differing from zero and $(\pm 1)$. We
observe here that the product of the characteristic roots of a matrix
of group $G$ must be equal to unity. In the second place, $H_1$ must
contain matrices $E$ and $(-E)$, and furthermore, taking into account the
case of equal characteristic roots and of a double elementary divisor,
we have to show that $H_1$ also contains the matrices

$$\begin{Vmatrix} 1 & 0 \\ 1 & 1 \end{Vmatrix} \quad \text{and} \quad \begin{Vmatrix} -1 & 0 \\ 1 & -1 \end{Vmatrix}. \tag{188}$$


We take a variable matrix of group $G$:

$$X = \begin{Vmatrix} x & y \\ z & x \end{Vmatrix} \qquad (x^2 - yz = 1),$$

and form the matrix

$$Y = A\,(X A^{-1} X^{-1}),$$

which, as in [70], must appear in $H_1$. We obtain for the trace $s$ of $Y$:

$$s = 2 + b^2 z^2 + c^2 y^2 - [(a - d)^2 + 2bc]\, yz.$$
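The expression for the trace is readily confirmed numerically. In the sketch below $d$ is computed from $ad - bc = 1$ (so that $a \neq 0$ is assumed) and $x$ is taken as one of the square roots of $1 + yz$; the function names and the sample values in the usage note are ours.

```python
import cmath


def matmul(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator_trace(a, b, c, y, z):
    """Trace of Y = A X A^(-1) X^(-1), computed by direct multiplication."""
    d = (1 + b * c) / a              # from ad - bc = 1, assuming a != 0
    x = cmath.sqrt(1 + y * z)        # so that x^2 - yz = 1
    A = [[a, b], [c, d]]
    A_inverse = [[d, -b], [-c, a]]
    X = [[x, y], [z, x]]
    X_inverse = [[x, -y], [-z, x]]
    Y = matmul(A, matmul(X, matmul(A_inverse, X_inverse)))
    return Y[0][0] + Y[1][1]


def trace_formula(a, b, c, y, z):
    """The closed expression for s given in the text."""
    d = (1 + b * c) / a
    return 2 + b ** 2 * z ** 2 + c ** 2 * y ** 2 - ((a - d) ** 2 + 2 * b * c) * y * z
```

For example, `commutator_trace(2, 1j, 0.5, 0.3 + 0.2j, -1.5j)` and `trace_formula` with the same arguments agree to within rounding error.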


Since $A$ differs from $E$ and $(-E)$, we cannot have simultaneously:
$b = c = 0$ and $a = d$. Hence $s$ is not constant, and on varying $z$ and
$y$, we can assign to $s$ any complex values. The characteristic roots of $Y$
are given by the quadratic equation

$$\lambda^2 - s\lambda + 1 = 0.$$


We can thus obtain arbitrary values $t$ and $t^{-1}$ for these roots, and
consequently $H_1$ contains all the matrices with different characteristic
roots and unity determinant. $H_1$ clearly also contains $E$, as also $(-E)$,
which can be written as the product:

$$-E = [t, t^{-1}] \cdot [-t^{-1}, -t],$$

where each factor is a diagonal matrix belonging to $H_1$. Moreover, matrices (188) are readily
written as the products of two matrices with unity determinant and
with different characteristic roots, whence it follows that $H_1$ also
contains these matrices. For


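$$\begin{Vmatrix} 1 & 0 \\ 1 & 1 \end{Vmatrix} = \begin{Vmatrix} t & 0 \\ t & t^{-1} \end{Vmatrix} \begin{Vmatrix} t^{-1} & 0 \\ 0 & t \end{Vmatrix}; \qquad \begin{Vmatrix} -1 & 0 \\ 1 & -1 \end{Vmatrix} = \begin{Vmatrix} t & 0 \\ -t & t^{-1} \end{Vmatrix} \begin{Vmatrix} -t^{-1} & 0 \\ 0 & -t \end{Vmatrix} \qquad (t \neq 0 \text{ and } \pm 1).$$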


We have now shown that $H_1$ must coincide with $G$, i.e. $G$ has no
normal subgroup except that consisting of $E$ and $(-E)$, and the
positive Lorentz transformation group is shown to be simple. Hence
it follows as in [70] that the group cannot have homomorphic (not
isomorphic) representations.




82. Continuous groups. Structural constants. The groups of rotations
of three-dimensional space and of positive Lorentz transformations 
provide examples of infinite groups which depend on continuously 
varying parameters. The role of parameters can be played say by the 
Eulerian angles in the case of rotation groups. The groups consist 
in these cases of linear transformations, and the parametric dependence 
of the groups amounts to the parametric dependence of the elements 
of the matrices by which the linear transformations are defined. 
Groups of linear transformations are discussed below. 

Let the matrix elements $a_{ik}$ of the linear transformations forming
a group $G$ be functions of $r$ real parameters $\alpha_1, \alpha_2, \ldots, \alpha_r$, and let
certain conditions which we shall indicate next be fulfilled. Let the $a_{ik}$
be single-valued functions of the $\alpha_s$ for all values of these parameters
sufficiently close to zero, and let the identity element of $G$,
characterized by the conditions $a_{ik} = 0$ for $i \neq k$ and $a_{ii} = 1$, correspond
to the zero values of the parameters $\alpha_1 = \alpha_2 = \ldots = \alpha_r = 0$. Suppose
further that definite values of the $\alpha_s$, sufficiently close to zero,
correspond to the elements of $G$ neighbouring the identity element. The
closeness of a group element to the identity element amounts to the
fact that the elements $a_{ik}$ of the corresponding matrices are near zero
for $i \neq k$ and near unity for $i = k$. We have with these assumptions
a one-to-one correspondence of elements of $G$ in a certain
neighbourhood of the identity element with points of a neighbourhood
round the origin of the real $r$-dimensional space $T_r$. We shall
consider later, not just this local one-to-one correspondence, but
a one-to-one correspondence as a whole, in which to each element
of $G$ there corresponds a definite point belonging to a domain $V$ of
the space $T_r$ containing the origin as an interior point, and conversely,
to any point of $V$ there corresponds a definite element of $G$. For the
present, we only require the above local correspondence. Elements
of $G$ will be written $G_\alpha$, $G_\beta$, $G_\gamma$ etc., corresponding to parametric
values $\alpha_s$, $\beta_s$, $\gamma_s$ $(s = 1, 2, \ldots, r)$. Taking the local view-point, the
parameters must be fairly close to zero and the group elements fairly
close to the identity element.

We consider a product of group elements:

$$G_\beta G_\alpha = G_\gamma.$$

The parameters $\gamma_s$ characterizing the element $G_\gamma$ obtained as a result
of the above multiplication are single-valued functions of the $\alpha_s$ and $\beta_s$:

$$\gamma_s = \varphi_s(\beta_1, \beta_2, \ldots, \beta_r; \alpha_1, \alpha_2, \ldots, \alpha_r). \tag{189}$$




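We take these functions to be continuous, with continuous derivatives
up to the fourth order for all $\alpha_s$ and $\beta_s$ fairly near zero.

The correspondence of the identity element to the zero values of the
parameters gives us at once:

$$\begin{aligned}
\varphi_s(\beta_1, \beta_2, \ldots, \beta_r; 0, 0, \ldots, 0) &= \beta_s, \\
\varphi_s(0, 0, \ldots, 0; \alpha_1, \alpha_2, \ldots, \alpha_r) &= \alpha_s
\end{aligned} \qquad (s = 1, 2, \ldots, r), \tag{190}$$

whence

$$\begin{aligned}
\frac{\partial \gamma_s}{\partial \beta_k} &= \delta_{sk} \quad \text{for } \alpha_s = 0; \\
\frac{\partial \gamma_s}{\partial \alpha_k} &= \delta_{sk} \quad \text{for } \beta_s = 0
\end{aligned} \qquad (s, k = 1, 2, \ldots, r). \tag{191}$$

The parameters $\tilde{\alpha}_s$ corresponding to the inverse element $G_\alpha^{-1}$ are
evidently given by

$$\varphi_s(\tilde{\alpha}_1, \tilde{\alpha}_2, \ldots, \tilde{\alpha}_r; \alpha_1, \alpha_2, \ldots, \alpha_r) = 0 \qquad (s = 1, 2, \ldots, r), \tag{192}$$

these equations being valid if all the $\alpha_s$ and $\tilde{\alpha}_s$ are set equal to zero.
The functional determinant of the left-hand sides of equations (192)
in $\tilde{\alpha}_s$ is, by (191), equal to unity for $\alpha_s$ and $\tilde{\alpha}_s$ equal to zero. Hence,
by the implicit function theorem, equations (192) define the $\tilde{\alpha}_s$ as
continuous functions for all $\alpha_s$ fairly near zero, the $\tilde{\alpha}_s$ being zero for
$\alpha_s = 0$. We expand functions (189) in powers of $\alpha_s$ and $\beta_s$, using
Maclaurin's formula, the expansion being carried out as far as the
third order terms. We obtain, on taking into account (190) and (191):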


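$$\gamma_s = \alpha_s + \beta_s + \sum_{i,k} a^{(s)}_{ik}\, \alpha_i \beta_k + \sum_{i,k,l} a^{(s)}_{ikl}\, \alpha_i \alpha_k \beta_l + \sum_{i,k,l} b^{(s)}_{ikl}\, \alpha_i \beta_k \beta_l + \varepsilon^{(s)}, \tag{193}$$

where the $a^{(s)}_{ik}$, $a^{(s)}_{ikl}$ and $b^{(s)}_{ikl}$ are numerical coefficients, $\varepsilon^{(s)}$ is of not
less than the fourth order of smallness with respect to $\alpha_s$ and $\beta_s$, and
the summation is over $i$, $k$ and $l$ from 1 to $r$. The numbers

$$C^{(s)}_{ik} = a^{(s)}_{ik} - a^{(s)}_{ki} \qquad (s, i, k = 1, 2, \ldots, r) \tag{194}$$

are known as the structural constants of group $G$ for the parameters $\alpha_s$
chosen.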
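As an illustration of definition (194), consider an example of our own that does not occur in the text: the two-parameter group of matrices with rows $(1 + \alpha_1, \alpha_2)$ and $(0, 1)$. The product $G_\beta G_\alpha$ has parameters $\gamma_1 = \alpha_1 + \beta_1 + \alpha_1\beta_1$, $\gamma_2 = \alpha_2 + \beta_2 + \beta_1\alpha_2$. Since this law contains no terms beyond the bilinear ones, the coefficients $a^{(s)}_{ik}$ are recovered exactly by a mixed second difference with unit steps; for a general group this shortcut would not be exact. Indices in the sketch are counted from zero.

```python
def compose(beta, alpha):
    """Parameters of G_beta G_alpha for the matrices [[1 + a1, a2], [0, 1]]."""
    return (alpha[0] + beta[0] + alpha[0] * beta[0],
            alpha[1] + beta[1] + beta[0] * alpha[1])


def unit(index, r=2):
    """Parameter set with 1 in position `index` and 0 elsewhere."""
    return tuple(1 if i == index else 0 for i in range(r))


def second_order_coefficients(r=2):
    """a[s][i][k]: coefficient of alpha_i beta_k in gamma_s (indices from 0)."""
    zero = (0,) * r
    base = compose(zero, zero)
    a = [[[0] * r for _ in range(r)] for _ in range(r)]
    for i in range(r):
        for k in range(r):
            both = compose(unit(k), unit(i))       # beta = e_k, alpha = e_i
            only_beta = compose(unit(k), zero)
            only_alpha = compose(zero, unit(i))
            for s in range(r):
                a[s][i][k] = both[s] - only_beta[s] - only_alpha[s] + base[s]
    return a


def structural_constants(a):
    """C[s][i][k] = a[s][i][k] - a[s][k][i], as in (194)."""
    r = len(a)
    return [[[a[s][i][k] - a[s][k][i] for k in range(r)] for i in range(r)]
            for s in range(r)]


a = second_order_coefficients()
C = structural_constants(a)
```

In this example the only non-zero structural constants are $C^{(2)}_{21} = 1$ and $C^{(2)}_{12} = -1$.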
If we bring in new parameters $\alpha_s'$ instead of $\alpha_s$:

$$\alpha_s = \omega_s(\alpha_1', \alpha_2', \ldots, \alpha_r') \qquad (s = 1, 2, \ldots, r),$$

so that (1) $\omega_s(0, 0, \ldots, 0) = 0$, (2) the equations written are uniquely
soluble with respect to the $\alpha_s'$ for all $\alpha_s$ sufficiently near zero, and (3) the
functions $\omega_s$ have a sufficient number of derivatives, then the
structural constants in the new parameters $\alpha_s'$ will be different.
It follows at once from definition (194) that

$$C^{(s)}_{ik} = -C^{(s)}_{ki}. \tag{194$_1$}$$

The following further relationships between the structural constants
can be proved by using (192) and the associative rule for
multiplication of the group elements:

$$\sum_{s=1}^{r} \left( C^{(s)}_{ij} C^{(t)}_{sk} + C^{(s)}_{jk} C^{(t)}_{si} + C^{(s)}_{ki} C^{(t)}_{sj} \right) = 0 \qquad (i, j, k, t = 1, 2, \ldots, r). \tag{194$_2$}$$

These relationships will not be used and their proof is omitted.

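We return to (193). With $\alpha_s$ and $\beta_s$ sufficiently near zero, the $\gamma_s$
will also be near zero. Taking (191) and the implicit function theorem
into account, we can say that (193) are soluble with respect to the $\beta_s$
in some neighbourhood of the origin of the space $T_r$:

$$\beta_s = \psi_s(\gamma_1, \gamma_2, \ldots, \gamma_r; \alpha_1, \alpha_2, \ldots, \alpha_r) \qquad (s = 1, 2, \ldots, r). \tag{195}$$

We note here that the condition: $\beta_s = 0$ $(s = 1, 2, \ldots, r)$ is
equivalent to the condition: $\gamma_s = \alpha_s$ $(s = 1, 2, \ldots, r)$. We use (193) and
(195) to form two square matrices $S(\alpha_s)$ and $T(\alpha_s)$ of order $r$ with
elements $S_{ik}(\alpha_s)$ and $T_{ik}(\alpha_s)$ depending on parameters $\alpha_s$:

$$S_{ik}(\alpha_s) = \left( \frac{\partial \gamma_i}{\partial \beta_k} \right)_{\beta_s = 0}; \qquad T_{ik}(\alpha_s) = \left( \frac{\partial \beta_i}{\partial \gamma_k} \right)_{\gamma_s = \alpha_s} \qquad (i, k = 1, 2, \ldots, r). \tag{196}$$

On recalling the differentiation rule for functions of a function and
calculating the derivative of $\gamma_i$ with respect to $\gamma_k$ or the derivative
of $\beta_i$ with respect to $\beta_k$, we get:

$$S(\alpha_s)\, T(\alpha_s) = E \quad \text{and} \quad T(\alpha_s)\, S(\alpha_s) = E, \tag{197}$$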


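where $E$ is the unit matrix of order $r$. It follows from (191) that $S(\alpha_s)$
becomes the unit matrix for $\alpha_s = 0$. In view of (197), it now follows
that $T(\alpha_s)$ has the same properties. The structural constants may
readily be expressed in terms of the elements of these matrices, in fact:

$$C^{(p)}_{ik} = \left( \frac{\partial S_{pk}(\alpha_s)}{\partial \alpha_i} - \frac{\partial S_{pi}(\alpha_s)}{\partial \alpha_k} \right)_{\alpha_s = 0} \tag{198}$$

or

$$C^{(p)}_{ik} = \left( \frac{\partial T_{pi}(\alpha_s)}{\partial \alpha_k} - \frac{\partial T_{pk}(\alpha_s)}{\partial \alpha_i} \right)_{\alpha_s = 0}. \tag{199}$$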




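For we have from (193) and (196):

$$a^{(p)}_{ik} = \left( \frac{\partial^2 \gamma_p}{\partial \alpha_i\, \partial \beta_k} \right)_{\alpha_s = \beta_s = 0} = \left( \frac{\partial S_{pk}(\alpha_s)}{\partial \alpha_i} \right)_{\alpha_s = 0}, \tag{200}$$

and we can write on interchanging subscripts $i$ and $k$:

$$a^{(p)}_{ki} = \left( \frac{\partial S_{pi}(\alpha_s)}{\partial \alpha_k} \right)_{\alpha_s = 0}, \tag{201}$$

whence (198) follows immediately. We have further, on taking (197)
into account:

$$\sum_{j=1}^{r} S_{pj}(\alpha_s)\, T_{jk}(\alpha_s) = \delta_{pk}.$$

We differentiate both sides with respect to $\alpha_i$ then set all the $\alpha_s$ equal
to zero. Recalling that $S(\alpha_s)$ and $T(\alpha_s)$ become the unit matrix for
$\alpha_s = 0$ $(s = 1, 2, \ldots, r)$, we get

$$\left( \frac{\partial S_{pk}(\alpha_s)}{\partial \alpha_i} \right)_{\alpha_s = 0} + \left( \frac{\partial T_{pk}(\alpha_s)}{\partial \alpha_i} \right)_{\alpha_s = 0} = 0,$$

i.e. by (200):

$$a^{(p)}_{ik} = -\left( \frac{\partial T_{pk}(\alpha_s)}{\partial \alpha_i} \right)_{\alpha_s = 0},$$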

whence (199) follows as above. Expressions (193) define the basic
group operation that gives the parameters $\gamma_s$ corresponding to the
product $G_\beta G_\alpha$ in terms of the parameters $\alpha_s$ and $\beta_s$ of elements $G_\alpha$
and $G_\beta$. It is clear from (193) that, for $\alpha_s$ and $\beta_s$ near zero, the group
operation reduces as a first approximation to: $\gamma_s = \alpha_s + \beta_s$, so that
the group is Abelian to a first approximation. If the group is strictly
Abelian, we have

$$\varphi_s(\beta_1, \beta_2, \ldots, \beta_r; \alpha_1, \alpha_2, \ldots, \alpha_r) = \varphi_s(\alpha_1, \alpha_2, \ldots, \alpha_r; \beta_1, \beta_2, \ldots, \beta_r) \qquad (s = 1, 2, \ldots, r)$$

and $a^{(s)}_{ik} = a^{(s)}_{ki}$ in expansions (193), i.e. all the structural constants
vanish for an Abelian group. For groups of a general type, the second
order terms in (193) produce a trend away from commutativeness,
the trend being indicated by the presence of non-zero structural
constants. An expansion of the parameters $\tilde{\alpha}_s$ corresponding to the element
$G_\alpha^{-1}$ may readily be obtained by using (193). We do this by putting
$\gamma_s = 0$ in (193) and replacing $\beta_s$ by $\tilde{\alpha}_s$. We get by the ordinary rule
for differentiating implicit functions:

$$\tilde{\alpha}_s = -\alpha_s + \sum_{i,k} a^{(s)}_{ik}\, \alpha_i \alpha_k + \varepsilon_1^{(s)},$$

where $\varepsilon_1^{(s)}$ is of at least the third order of smallness with respect to
$\alpha_1, \alpha_2, \ldots, \alpha_r$.


83. Infinitesimal transformations. Suppose we have, as above, a
continuous group $G$ of linear transformations of order $n$ defined by
parameters $\alpha_s$ $(s = 1, 2, \ldots, r)$. We shall write $G_\alpha$ as above for the
matrix of the transformation corresponding to parameters $\alpha_s$, so that
a linear transformation has the form

$$x = G_\alpha u, \tag{202}$$

where $u$ is any vector of the complex $n$-dimensional space $R_n$ and $x$
is the transformed vector. We bring in the operation of matrix
differentiation: if the elements of a matrix $A$ are differentiable functions
of a parameter $t$, the derivative of $A$ with respect to $t$ is defined as the
matrix whose elements are the derivatives of the elements of $A$, i.e.

$$\left\{ \frac{dA}{dt} \right\}_{ik} = \frac{d\{A\}_{ik}}{dt}.$$

We obtain partial derivatives if the elements of $A$ depend on several
variables.

Similarly, if the components of a vector $z(z_1, z_2, \ldots, z_n)$ of the space
$R_n$ are differentiable functions with respect to $t$, $dz/dt$ is defined as the
vector with components $dz_k/dt$, i.e. differentiation of a vector amounts
to differentiation of its components [II, 107].

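We now introduce the so-called infinitesimal transformations of a
group $G$:

$$I_k = \left( \frac{\partial G_\alpha}{\partial \alpha_k} \right)_{\alpha_s = 0} \qquad (k = 1, 2, \ldots, r). \tag{203}$$

The symbol $I_k$ clearly denotes a matrix of order $n$ with numerical
elements.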
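Definition (203) can be illustrated on the simplest case of one parameter ($r = 1$). The sketch below takes, as an assumed example, the rotations of the plane by an angle $\alpha$ and approximates the derivative of the matrix element by element with a central difference; the step `h` and the function names are our own choices.

```python
import math


def rotation(alpha):
    """Matrix of the rotation of the plane by the angle alpha."""
    return [[math.cos(alpha), -math.sin(alpha)],
            [math.sin(alpha), math.cos(alpha)]]


def matrix_derivative(matrix_function, t, h=1e-6):
    """Element-by-element central difference approximation to dA/dt."""
    plus = matrix_function(t + h)
    minus = matrix_function(t - h)
    return [[(plus[i][k] - minus[i][k]) / (2 * h) for k in range(len(plus))]
            for i in range(len(plus))]


# Approximation to the infinitesimal transformation I_1 of the rotation group.
I_1 = matrix_derivative(rotation, 0.0)
```

The exact derivative at $\alpha = 0$ has rows $(0, -1)$ and $(1, 0)$, and the numerical approximation agrees with it to within the accuracy of the difference quotient.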

We now return to (202) and let $u$ be a fixed vector, i.e. its components
are independent of the $\alpha_s$. Obviously, the transformed vector will in
general depend on the parameters, and we next derive the fundamental
differential equations of this vector. For this, we apply to both sides
of (202) the linear operation defined by matrix $G_\beta$:

$$G_\beta x = G_\gamma u,$$

where $G_\gamma = G_\beta G_\alpha$, and parameters $\gamma_s$ are given in terms of $\alpha_s$ and $\beta_s$
in accordance with the basic group operation (193). We differentiate
both sides of the last expression with respect to $\beta_p$ then put $\beta_s = 0$,
i.e. $\gamma_s = \alpha_s$. We obtain on using definition (203):

$$I_p x = \sum_{j=1}^{r} \left( \frac{\partial (G_\gamma u)}{\partial \gamma_j} \right)_{\gamma_s = \alpha_s} \left( \frac{\partial \gamma_j}{\partial \beta_p} \right)_{\beta_s = 0}.$$

The first factor under the summation sign is evidently equal to the
derivative of the right-hand side of (202) with respect to $\alpha_j$, and if we
recall notation (196), we can re-write this last equation as

$$I_p x = \sum_{j=1}^{r} S_{jp}(\alpha_s)\, \frac{\partial x}{\partial \alpha_j} \qquad (p = 1, 2, \ldots, r).$$

If we introduce the vectors

$$X\left( \frac{\partial x}{\partial \alpha_1}, \frac{\partial x}{\partial \alpha_2}, \ldots, \frac{\partial x}{\partial \alpha_r} \right) \quad \text{and} \quad Y(I_1 x, I_2 x, \ldots, I_r x),$$

we can now write the above as a linear transformation:

$$Y = S^*(\alpha_s)\, X,$$

where $S^*(\alpha_s)$ denotes the transposed matrix as usual. We find, on
multiplying on the left by $T^*(\alpha_s)$ and taking (197) into account:

$$X = T^*(\alpha_s)\, Y,$$

or in the expanded form:

$$\frac{\partial x}{\partial \alpha_p} = \sum_{j=1}^{r} T_{jp}(\alpha_s)\, I_j x \qquad (p = 1, 2, \ldots, r). \tag{204}$$

We have for the components $x_k$ of the vector $x$ defined by (202):

$$\frac{\partial x_k}{\partial \alpha_p} = \sum_{j=1}^{r} T_{jp}(\alpha_s) \sum_{i=1}^{n} \{I_j\}_{ki}\, x_i \qquad \begin{pmatrix} k = 1, 2, \ldots, n \\ p = 1, 2, \ldots, r \end{pmatrix}, \tag{205}$$

where the $\{I_j\}_{ki}$ are the elements of matrix $I_j$. We must add to
equations (204) for $x$ the initial condition, following at once from (202):

$$x \big|_{\alpha_s = 0} = u, \tag{206}$$

where $u$ is an arbitrarily assigned vector. We observe that the $T_{jp}(\alpha_s)$
appearing in the coefficients of (204) may be found directly in
accordance with group operation (193). Equations (204) lead us to
relationships between the $I_k$; these may be derived simply by writing down
the condition that the second derivative of $x$ with respect to $\alpha_p$ and
$\alpha_q$ is independent of the order of differentiation.




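We have from (204):

$$\frac{\partial^2 x}{\partial \alpha_p\, \partial \alpha_q} = \sum_{j=1}^{r} \left[ \frac{\partial T_{jp}(\alpha_s)}{\partial \alpha_q}\, I_j x + T_{jp}(\alpha_s)\, I_j\, \frac{\partial x}{\partial \alpha_q} \right]$$

or, on replacing $\partial x / \partial \alpha_q$ by its expression from (204) with $p = q$:

$$\frac{\partial^2 x}{\partial \alpha_p\, \partial \alpha_q} = \sum_{j=1}^{r} \frac{\partial T_{jp}(\alpha_s)}{\partial \alpha_q}\, I_j x + \sum_{j=1}^{r} \sum_{k=1}^{r} T_{jp}(\alpha_s)\, T_{kq}(\alpha_s)\, I_j I_k x.$$

On interchanging $p$ and $q$ in the above and equating right-hand sides,
we get the following corollary of system (205):

$$\left[ \sum_{j=1}^{r} \left( \frac{\partial T_{jp}(\alpha_s)}{\partial \alpha_q} - \frac{\partial T_{jq}(\alpha_s)}{\partial \alpha_p} \right) I_j + \sum_{j=1}^{r} \sum_{k=1}^{r} \left( T_{jp}(\alpha_s)\, T_{kq}(\alpha_s) - T_{jq}(\alpha_s)\, T_{kp}(\alpha_s) \right) I_j I_k \right] x = 0. \tag{207}$$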


We put all the $\alpha_s$ equal to zero in this relationship. We now have,
on taking into account (199) and the fact that $T(\alpha_s) = E$ and $x = u$ when all
the $\alpha_s$ vanish:

$$\left[ \sum_{j=1}^{r} C^{(j)}_{pq}\, I_j + (I_p I_q - I_q I_p) \right] u = 0,$$

whence, in view of the arbitrariness of vector $u$, we have the following
relationships between the infinitesimal transformations:

$$I_q I_p - I_p I_q = \sum_{j=1}^{r} C^{(j)}_{pq}\, I_j \qquad (p, q = 1, 2, \ldots, r). \tag{208}$$
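Relationships (208) may be tested on a concrete group. We take, as an example of our own, the matrices with rows $(1 + \alpha_1, \alpha_2)$ and $(0, 1)$, for which $\gamma_1 = \alpha_1 + \beta_1 + \alpha_1\beta_1$ and $\gamma_2 = \alpha_2 + \beta_2 + \beta_1\alpha_2$, so that by (194) the only non-zero structural constants are $C^{(2)}_{21} = 1$, $C^{(2)}_{12} = -1$. These values are entered by hand in the sketch, with indices counted from zero; since the matrix depends linearly on the parameters, a unit difference gives the derivatives (203) exactly.

```python
def group_matrix(alpha):
    """Matrix G_alpha of the assumed example group."""
    return [[1 + alpha[0], alpha[1]], [0, 1]]


def generators():
    """I_k = dG/d(alpha_k) at alpha = 0, exact here because G is linear in alpha."""
    base = group_matrix((0, 0))
    result = []
    for k in range(2):
        step = group_matrix(tuple(1 if i == k else 0 for i in range(2)))
        result.append([[step[i][j] - base[i][j] for j in range(2)]
                       for i in range(2)])
    return result


def matmul(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# C[j][p][q] = C^(j+1)_{p+1, q+1}, found by hand from the composition law.
C = [[[0, 0], [0, 0]], [[0, -1], [1, 0]]]


def commutation_defect(p, q):
    """I_q I_p - I_p I_q - sum_j C^(j)_{pq} I_j; the zero matrix if (208) holds."""
    I = generators()
    left = matmul(I[q], I[p])
    right = matmul(I[p], I[q])
    return [[left[i][k] - right[i][k] - sum(C[j][p][q] * I[j][i][k] for j in range(2))
             for k in range(2)] for i in range(2)]
```

Calling `commutation_defect(p, q)` for all four pairs of indices gives the zero matrix in each case.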


We have found the J; and proved relationships (208) by starting 
from a given continuous group G and using equations (204). We show 
that this system, or what comes to the same thing, system (205), has 
a unique solution for the given initial condition (206). Suppose there 
are two solutions. In view of the linearity of (204), their difference 
must also satisfy the system and must become the null vector with 
ds = 0. We thus want to show that the solution x of (204) with the 
zero initial condition is identically zero. To simplify the writing we 
shall take 7 = 3. Let our solution be x(qa,, a,, a). We write (204) for 
p= 1 and put a, =a, = 0 on the right-hand side. We get an ordi- 
nary differential equation with a, as independent variable and the zero 
initial condition. The solution is identically zero by the familiar unique- 
ness theorem [II, 50], ie. x(a,, 0,0) == 0. We now write (204) with 




p = 2 and set $\alpha_3 = 0$ on the right-hand side. This ordinary differential equation with $\alpha_2$ as independent variable has the zero initial condition, as we have just shown: $x(\alpha_1, \alpha_2, 0) = 0$ for $\alpha_2 = 0$, and consequently, by the uniqueness theorem, $x(\alpha_1, \alpha_2, 0) \equiv 0$. We now write (204) for p = 3. This ordinary differential equation has the zero initial condition $x(\alpha_1, \alpha_2, \alpha_3) = 0$ for $\alpha_3 = 0$, and consequently $x(\alpha_1, \alpha_2, \alpha_3) \equiv 0$, which is what we wanted to prove.

Hence (204) can only lead to a single finite transformation (202)
for given infinitesimal transformations $I_j$ and given $T_{jp}(\alpha_s)$, which are
defined by group operation (193). In other words, the infinitesimal
transformations define a group. This is essential as regards what 
follows. The proof of the existence of a solution of (204) is based 
on a general theorem for partial differential equations, which, as 
far as (204) is concerned, may be stated as follows: the necessary 
and sufficient condition for (204) to have a solution for any given 
initial condition (206) is that the square bracket in (207) vanishes 
identically with respect to $\alpha_s$ for any choice of p and q. We shall make
no further use of this existence theorem. 
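The way an infinitesimal transformation fixes the finite one can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the one-parameter group of plane rotations, for which the analogue of (204) is assumed to reduce to dx/dα = I x with the constant matrix I = [[0, -1], [1, 0]], and integrates it with a classical Runge-Kutta step. The names `integrate`, `rotate` and `apply` and the choice of integrator are hypothetical and not taken from the text.

```python
import math

# Assumed infinitesimal transformation of plane rotations:
# the derivative of the rotation matrix with respect to the angle, at angle 0.
I = [[0.0, -1.0], [1.0, 0.0]]

def apply(matrix, x):
    """Product of a square matrix with a vector."""
    return [sum(matrix[i][k] * x[k] for k in range(len(x))) for i in range(len(matrix))]

def integrate(u, alpha, steps=1000):
    """Solve dx/da = I x with x(0) = u up to a = alpha (classical RK4)."""
    h = alpha / steps
    x = list(u)
    for _ in range(steps):
        k1 = apply(I, x)
        k2 = apply(I, [x[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = apply(I, [x[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = apply(I, [x[i] + h * k3[i] for i in range(2)])
        x = [x[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(2)]
    return x

def rotate(u, alpha):
    """Finite rotation of the plane by the angle alpha."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [c * u[0] - s * u[1], s * u[0] + c * u[1]]

u = [1.0, 2.0]
numeric = integrate(u, 1.0)
exact = rotate(u, 1.0)
error = max(abs(numeric[i] - exact[i]) for i in range(2))
```

In this sketch the zero initial vector is carried into the zero vector, in line with the uniqueness argument, and a non-zero u is carried into the rotated vector up to the discretisation error.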


84. Rotation groups. We take as an example the group of rotations 
of space about the origin. The corresponding third order matrices 
depend on three parameters. The role of parameters can be played say 
by the Eulerian angles. We shall now introduce different parameters $\alpha_1, \alpha_2, \alpha_3$, in which all our future working will be performed. Any rotation may be considered as taking place about some axis l, passing through the origin, in a counter-clockwise direction and by an angle not exceeding $\pi$. Two rotations by an angle $\pi$ about axes in opposite directions now lead to the same final position. We can thus look on any rotation as a vector from the origin in the direction of the axis of rotation and with a length equal to the angle of rotation. The projections $(\alpha_1, \alpha_2, \alpha_3)$ of this vector on the coordinate axes in fact serve as our parameters.

If we take a sphere V with centre at the origin and radius $\pi$ and look on the ends of any diameter as identical, a one-to-one correspondence may be established between the points $(\alpha_1, \alpha_2, \alpha_3)$ of the sphere V and the elements of the rotation group. This applies not only in the neighbourhood of the origin and the identity element, but for the group as a whole, if we take the whole of the sphere V. All the matrices of the rotation group can be expressed in terms of parameters $\alpha_1, \alpha_2, \alpha_3$ and the continuity and existence of derivatives mentioned above are satisfied.




We shall not deduce (193) for the basic group operation in the 
present case; instead, we determine the structural constants by evalua- 
ting directly the matrices of the infinitesimal transformations. 

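To evaluate $I_1$, we can take $\alpha_2 = \alpha_3 = 0$, differentiate the transformation matrix with respect to $\alpha_1$, then set $\alpha_1 = 0$. But for $\alpha_2 = \alpha_3 = 0$, we have a rotation about the x axis by the angle $\alpha_1$, which leads us to the formulae:

$$x_1' = x_1, \qquad x_2' = x_2 \cos\alpha_1 - x_3 \sin\alpha_1, \qquad x_3' = x_2 \sin\alpha_1 + x_3 \cos\alpha_1.$$

We obtain on differentiating the matrix of this transformation with respect to $\alpha_1$ then setting $\alpha_1 = 0$:

$$I_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}. \tag{209}$$

We find similarly:

$$I_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}; \qquad I_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{210}$$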


We can now evaluate the left-hand side of (208) and thus find the structural constants. Elementary working leads us to the following three relationships:

$$I_1 I_2 - I_2 I_1 = I_3; \qquad I_2 I_3 - I_3 I_2 = I_1; \qquad I_3 I_1 - I_1 I_3 = I_2. \tag{211}$$
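The "elementary working" can be checked mechanically. The sketch below assumes the standard matrices of infinitesimal rotations about the three coordinate axes, as read from (209) and (210), and verifies the three relationships (211) with exact integer arithmetic; the helper names `matmul` and `commutator` are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    """The matrix ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

# Infinitesimal rotations about the x, y and z axes (assumed reading of (209), (210)).
I1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
I2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
I3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

relations_211_hold = (
    commutator(I1, I2) == I3
    and commutator(I2, I3) == I1
    and commutator(I3, I1) == I2
)
```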

If we expand the right-hand side of (202) in powers of $\alpha_s$ and confine ourselves to first order terms, we get

$$x = u + (\alpha_1 I_1 + \alpha_2 I_2 + \alpha_3 I_3)\, u.$$

Hence u undergoes the following change as a result of the transformation:

$$\delta u = \alpha_1 I_1 u + \alpha_2 I_2 u + \alpha_3 I_3 u.$$

Each term on the right gives the change of u for a small rotation about one of the coordinate axes. For instance, we get the following change in the components $(u_1, u_2, u_3)$ of u for a rotation by the small angle $\alpha_1$ about the x axis:

$$\delta u_1 = 0; \qquad \delta u_2 = -u_3 \alpha_1; \qquad \delta u_3 = u_2 \alpha_1.$$

Here, as above, we have confined ourselves to first order terms in $\alpha_1$.
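To see what "first order terms" means in practice, the sketch below compares the exact rotation about the x axis with the first order change given above. It is a hedged illustration only; the function names and the sample vector are hypothetical, and the quadratic decay of the error is the usual Taylor estimate rather than a statement from the text.

```python
import math

def rotate_x(u, a1):
    """Exact rotation of u = (u1, u2, u3) about the x axis by the angle a1."""
    u1, u2, u3 = u
    return (u1,
            u2 * math.cos(a1) - u3 * math.sin(a1),
            u2 * math.sin(a1) + u3 * math.cos(a1))

def first_order_change(u, a1):
    """First order change of u: (0, -u3*a1, u2*a1)."""
    u1, u2, u3 = u
    return (0.0, -u3 * a1, u2 * a1)

def max_error(u, a1):
    """Largest component of (exact rotation) - (u + first order change)."""
    exact = rotate_x(u, a1)
    delta = first_order_change(u, a1)
    return max(abs(exact[i] - (u[i] + delta[i])) for i in range(3))

u = (1.0, 2.0, 3.0)
small = max_error(u, 1e-3)
larger = max_error(u, 1e-2)
```

Reducing the angle by a factor of ten reduces the discrepancy by roughly a factor of a hundred, as expected when only second order terms are neglected.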




85. Infinitesimal transformations and representations of the rotation 
group. We next show the connection between the above discussion 
of infinitesimal transformations and representations of the rotation 
group. We shall assume a one-to-one representation in the neighbourhood of the identity transformation by matrices $F(\alpha_1, \alpha_2, \alpha_3)$ of order n, the matrix elements being assumed continuous and differentiable functions of the parameters $\alpha_1, \alpha_2, \alpha_3$. Any rotation D can be
obtained as the product of a finite number of rotations of the above 
neighbourhood, and the product of the corresponding representation 
matrices gives the representation for D. The representation as a whole 
may be many-valued, however, since we can return to the initial ro- 
tation by continuous variation of the parameters and hence obtain 
a new representation for this rotation. We had this situation previously 
in the case of two-valued representations of the total rotation group [69]. 

We have the same group operation for matrices $F(\alpha_1, \alpha_2, \alpha_3)$ as for the rotations themselves, and consequently the same structural constants. We can form infinitesimal transformations $I_s$ for the group G' of matrices $F(\alpha_1, \alpha_2, \alpha_3)$. The $I_s$ will be nth order matrices connected by expressions (211). If the $I_s$ can be found, we can write down differential equations (204) for a vector x of $R_n$, since the $T_{jp}(\alpha_s)$ are defined solely by the group operation. These equations have a unique solution for a given initial condition (206), and this solution can clearly only be the transformation

$$x = F(\alpha_1, \alpha_2, \alpha_3)\, u,$$

which gives the representation of the rotation group in the neighbourhood of the identity element (transformation).

In the present case, r = 3, and on passing to the components of x in (204), we get 3n equations for the n components of

$$x(x_1, x_2, \ldots, x_n).$$

The only point of importance to us below is that (204) cannot have more than one solution for a given initial condition (206). As already mentioned, this can be stated as follows: a representation of the rotation group is fully defined by its infinitesimal transformations $I_1, I_2, I_3$.

It is thus entirely a question of determining the infinitesimal trans- 
formations of the representation, and we shall now consider this. We 
introduce new required matrices instead of $I_1, I_2, I_3$:

$$A_1 = -I_2 + iI_1; \qquad A_2 = I_2 + iI_1; \qquad A_3 = iI_3. \tag{212}$$




The following relationships are easily seen to hold between these, instead of (211):

$$A_3 A_1 - A_1 A_3 = A_1, \qquad A_3 A_2 - A_2 A_3 = -A_2, \qquad A_1 A_2 - A_2 A_1 = 2A_3. \tag{213}$$
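The passage from (211) to (213) can be checked on the rotation matrices themselves. The sketch below assumes the reading $A_1 = iI_1 - I_2$, $A_2 = iI_1 + I_2$, $A_3 = iI_3$ of (212) and the standard matrices $I_1, I_2, I_3$ of infinitesimal rotations; with these assumptions the three relationships hold exactly in complex integer arithmetic. The helper names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def combine(c1, m1, c2, m2):
    """The matrix c1*m1 + c2*m2."""
    n = len(m1)
    return [[c1 * m1[i][j] + c2 * m2[i][j] for j in range(n)] for i in range(n)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

# Infinitesimal rotations about the coordinate axes (assumed standard form).
I1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
I2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
I3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

# Assumed reading of (212).
A1 = combine(1j, I1, -1, I2)
A2 = combine(1j, I1, 1, I2)
A3 = scale(1j, I3)

relations_213_hold = (
    commutator(A3, A1) == A1
    and commutator(A3, A2) == scale(-1, A2)
    and commutator(A1, A2) == scale(2, A3)
)
```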


The representations by matrices $F(\alpha_1, \alpha_2, \alpha_3)$ must include, in particular, the representation of the Abelian group of rotations about the z axis, to the elements of which the matrices $F(0, 0, \alpha_3)$ correspond. All these matrices simultaneously take the diagonal form by a suitable choice of fundamental vectors, since irreducible representations of an Abelian group are of the first order. Transformation $F(0, 0, \alpha_3)$ becomes, for these fundamental vectors [69]:

$$F(0, 0, \alpha_3)\, v = e^{l\alpha_3}\, v$$

or, on setting l = -im and writing $v_m$ for v:

$$F(0, 0, \alpha_3)\, v_m = e^{-im\alpha_3}\, v_m.$$


Since the condition that the representation be single-valued is laid down only in the neighbourhood of $\alpha_s = 0$ (s = 1, 2, 3), we cannot assume that m is an integer. We obtain from the above, on the basis of the definition of $I_3$:

$$A_3 v_m = iI_3 v_m = i\left[\frac{\partial}{\partial\alpha_3} F(0, 0, \alpha_3)\, v_m\right]_{\alpha_3 = 0} = i\left[\frac{\partial}{\partial\alpha_3}\, e^{-im\alpha_3}\, v_m\right]_{\alpha_3 = 0} = m v_m.$$

Hence

$$A_3 v_m = m v_m; \tag{214}$$

i.e. $v_m$ is the eigenvector of the operator $A_3$ corresponding to the eigenvalue m. If there are several eigenvectors, $v_m$ denotes one of them.

We now prove the following lemma: 

Lemma. If v is the eigenvector of operator $A_3$ corresponding to the eigenvalue a, then $A_1 v$, if it differs from zero, is also an eigenvector of $A_3$, corresponding to the eigenvalue (a + 1), and similarly $A_2 v$ is an eigenvector of $A_3$, corresponding to the eigenvalue (a - 1).

We have $A_3 v = av$ by hypothesis, whence by (213):

$$A_3 (A_1 v) = (A_1 A_3 + A_1)\, v = A_1 (A_3 v) + A_1 v = A_1 (av) + A_1 v = (a + 1)\, A_1 v$$

and similarly

$$A_3 (A_2 v) = (a - 1)\, A_2 v.$$




The number of different eigenvalues of $A_3$ is not greater than n. There will be one or several of these with a maximum real part; we call this value (or one of the values) j, and write $v_j$ for the corresponding eigenvector. By our lemma, $A_1 v_j$ must relate to the eigenvalue (j + 1), whereas, by definition of j, there is no such eigenvalue of $A_3$, so that we must have

$$A_1 v_j = 0. \tag{215}$$


By the lemma, the vectors

$$v_{j-1} = A_2 v_j, \qquad v_{j-2} = A_2 v_{j-1}, \quad \ldots \tag{216}$$

relate, if not zero, to the eigenvalues (j - 1), (j - 2), ... of operator $A_3$. Vector sequence (216) must naturally lead to the null vector in the end, since the number of different eigenvalues for $A_3$ is not greater than n. We now prove the formula

$$A_1 v_k = \rho_k v_{k+1} \qquad (k = j, j-1, j-2, \ldots), \tag{217}$$

where the $\rho_k$ are integers. This is true for k = j by (215), in which case $\rho_j = 0$, whilst the null vector can be taken, say, for $v_{j+1}$. We now suppose that (217) is true for any of the k concerned, and show that it is true for (k - 1). We have by (213), (216) and (217):

$$A_1 v_{k-1} = A_1 (A_2 v_k) = (A_2 A_1 + 2A_3)\, v_k = A_2 (A_1 v_k) + 2A_3 v_k = A_2 (\rho_k v_{k+1}) + 2k v_k = (\rho_k + 2k)\, v_k.$$

We remark that, with k = j, we make no use of the expression

$$A_2 v_{k+1} = v_k$$

because $\rho_k = 0$ for k = j. Equation (217) is thus proved, and the numbers $\rho_k$ are defined by the relationships:

$$\rho_{k-1} = \rho_k + 2k; \qquad \rho_j = 0 \qquad (k = j, j-1, \ldots).$$

We obtain by calculating the values successively:

$$\rho_k = j(j+1) - k(k+1),$$

i.e.

$$A_1 v_k = [\,j(j+1) - k(k+1)\,]\, v_{k+1} \qquad (k = j, j-1, \ldots). \tag{218}$$
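The successive calculation of the $\rho_k$ can be reproduced with exact rational arithmetic. The sketch below is an illustration only: it runs the recurrence $\rho_{k-1} = \rho_k + 2k$, $\rho_j = 0$ downwards from k = j and compares it with the closed form $j(j+1) - k(k+1)$; the function names are hypothetical, and j is restricted here to integer or half-integer values for convenience.

```python
from fractions import Fraction

def rho_by_recurrence(j):
    """Values rho_k for k = j, j-1, ..., -j-1 from rho_{k-1} = rho_k + 2k, rho_j = 0."""
    j = Fraction(j)
    rho = {j: Fraction(0)}
    k = j
    while k > -j - 1:
        rho[k - 1] = rho[k] + 2 * k
        k -= 1
    return rho

def rho_closed_form(j, k):
    """Closed form rho_k = j(j+1) - k(k+1)."""
    j, k = Fraction(j), Fraction(k)
    return j * (j + 1) - k * (k + 1)

def recurrence_matches_closed_form(j):
    rho = rho_by_recurrence(j)
    return all(value == rho_closed_form(j, k) for k, value in rho.items())

rho_three_halves = rho_by_recurrence(Fraction(3, 2))
```

For j = 3/2 the recurrence gives the values 0, 3, 4, 3, 0 for k = 3/2, 1/2, -1/2, -3/2, -5/2, so that the coefficient vanishes again at k = -(j + 1).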
We use these equations to find the subscript s of the first of vectors (216) that vanishes, i.e. $v_s = 0$ and $v_{s+1} \neq 0$. It follows from (217) that $\rho_s = 0$, i.e.

$$j(j+1) - s(s+1) = 0.$$




This is a quadratic equation in s with roots s = j and s = -(j + 1). The value s = j is unsuitable, since $v_j$ is not zero and does not appear in sequence (216). Hence the vectors of (216):

$$v_j, \; v_{j-1}, \; \ldots, \; v_{-j+1}, \; v_{-j} \tag{219}$$

differ from zero, and $A_2 v_{-j} = 0$. There are (2j + 1) of these vectors, whence it is clear that j is either a non-negative integer or half a positive odd integer. If 2j + 1 = n, we can take vectors (219) as the fundamental set in space $R_n$. On the other hand, if 2j + 1 < n, they form a subspace $L_{2j+1}$ in $R_n$. Suppose that this latter case holds. Each $v_k$ of sequence (219) satisfies the equation:

$$A_3 v_k = k v_k \qquad (k = j, j-1, \ldots, -j+1, -j).$$


We have further, $A_2 v_k = v_{k-1}$, where $v_{-j-1} = 0$, together with (218). The operators $A_1, A_2, A_3$ displace subspace $L_{2j+1}$ into itself, and the above formulae fully define the operators in the subspace. Moreover it follows at once from (216) and (218) that there is no subspace $L_k$ inside $L_{2j+1}$, where 0 < k < 2j + 1, which remains unchanged on application of operators $A_1, A_2, A_3$. Having found the $A_s$, we can construct for the subspace $L_{2j+1}$ the equations (204), which are necessarily satisfied by the vector

$$x = F_j(\alpha_1, \alpha_2, \alpha_3)\, u \tag{220}$$

of the required representation in $L_{2j+1}$. This representation can leave no subspace $L_k$ of $L_{2j+1}$ invariant, i.e. it is irreducible in $L_{2j+1}$, since otherwise every $A_s$ would have to leave $L_k$ invariant, and this is untrue, as we have just seen. If 2j + 1 = n, the above discussion applies to $R_n$ as a whole. With 2j + 1 < n, we have distinguished from the total representation in $R_n$ a representation of order (2j + 1) which is irreducible in the sense indicated, i.e. it leaves invariant no subspace $L_k$ for which 0 < k < 2j + 1. A direct consequence of our arguments is that there exists only one irreducible representation, leaving aside similar representations, of a given degree. Yet we have already formed irreducible unitary representations of any given degree in [69].

These in fact account for all the possible irreducible representations, and the representations based on operators $A_s$ that we have constructed in $L_{2j+1}$ must be similar to them.

Vectors (219) can be multiplied by an arbitrary non-zero numerical 
factor. In this case, numerical factors also make their appearance in 
(216) and (218). The factors can be chosen so that the following 




relationships are finally obtained: 


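$$\begin{aligned} A_1 v_k &= \sqrt{j(j+1) - k(k+1)}\; v_{k+1}, \\ A_2 v_k &= \sqrt{j(j+1) - k(k-1)}\; v_{k-1}, \\ A_3 v_k &= k v_k, \end{aligned} \tag{221}$$

where $v_{j+1} = 0$ and $v_{-j-1} = 0$.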
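Formulae (221) define explicit matrices, and it is easy to check numerically that they satisfy (213). The sketch below is a hedged illustration: it builds the matrices in the basis $v_j, v_{j-1}, \ldots, v_{-j}$ (an ordering chosen here for convenience), takes 2j as an integer argument `two_j`, and tests the three commutation relationships up to rounding error. All function names are hypothetical.

```python
import math

def representation_matrices(two_j):
    """Matrices of A1, A2, A3 from (221) in the basis v_j, v_{j-1}, ..., v_{-j}."""
    j = two_j / 2
    n = two_j + 1
    ks = [j - m for m in range(n)]            # k = j, j-1, ..., -j
    A1 = [[0.0] * n for _ in range(n)]
    A2 = [[0.0] * n for _ in range(n)]
    A3 = [[0.0] * n for _ in range(n)]
    for col, k in enumerate(ks):
        A3[col][col] = k                      # A3 v_k = k v_k
        if col > 0:                           # A1 v_k is a multiple of v_{k+1}
            A1[col - 1][col] = math.sqrt(j * (j + 1) - k * (k + 1))
        if col < n - 1:                       # A2 v_k is a multiple of v_{k-1}
            A2[col + 1][col] = math.sqrt(j * (j + 1) - k * (k - 1))
    return A1, A2, A3

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def defect(a, b, target):
    """Largest entry of ab - ba - target in absolute value."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return max(abs(ab[i][j] - ba[i][j] - target[i][j]) for i in range(n) for j in range(n))

def satisfies_213(two_j, tol=1e-9):
    A1, A2, A3 = representation_matrices(two_j)
    minus_A2 = [[-x for x in row] for row in A2]
    twice_A3 = [[2 * x for x in row] for row in A3]
    return max(defect(A3, A1, A1), defect(A3, A2, minus_A2), defect(A1, A2, twice_A3)) < tol
```

For example, `satisfies_213(3)` tests the four-dimensional case j = 3/2.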


We obtain with this choice of factors the same representations as 
were formed in [69] by starting from the quantities 


m= ae (222) 
The construction indicated above makes it possible to separate out the irreducible parts of any representation. All in all, it is a question of seeking the eigenvectors $v_j$ of operator $A_3$ with the maximum eigenvalue and forming (216).


86. Representations of the Lorentz group. We take the group of 
linear transformations with unity determinant: 


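$$x_1' = a x_1 + b x_2, \qquad x_2' = c x_1 + d x_2. \tag{223}$$

A transformation matrix contains four complex coefficients, with one relationship between them. Since three complex quantities remain arbitrary, we have six real parameters. We introduce these parameters by denoting a transformation matrix as:

$$A = \begin{pmatrix} 1 + \alpha_1 + i\alpha_2 & \alpha_3 + i\alpha_4 \\ \alpha_5 + i\alpha_6 & d(\alpha_s) \end{pmatrix}, \tag{224}$$

where

$$d(\alpha_s) = \frac{1 + (\alpha_3 + i\alpha_4)(\alpha_5 + i\alpha_6)}{1 + \alpha_1 + i\alpha_2}.$$

We obtain six infinitesimal transformations $I_s$, which are readily constructed. To find say $I_1$, we have to put all the $\alpha_s$ except $\alpha_1$ equal to zero in A, differentiate A with respect to $\alpha_1$, then set $\alpha_1 = 0$. We find in this way that

$$I_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \qquad I_2 = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}; \qquad I_3 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix};$$

$$I_4 = \begin{pmatrix} 0 & i \\ 0 & 0 \end{pmatrix}; \qquad I_5 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}; \qquad I_6 = \begin{pmatrix} 0 & 0 \\ i & 0 \end{pmatrix}.$$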




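The structural constants $C_{pq}^{(j)}$ appearing in (208) must be real by definition, and they can be determined from the relationships:

$$I_p I_q - I_q I_p = \sum_{j=1}^{6} C_{pq}^{(j)} I_j \qquad (p < q; \; p, q = 1, 2, \ldots, 6).$$

It must be observed here that no linear (non-trivial) relationship with real coefficients exists between the matrices $I_j$, so that the following fifteen relationships can be obtained:

$$\begin{aligned}
I_1 I_3 - I_3 I_1 &= 2I_3, & I_1 I_4 - I_4 I_1 &= 2I_4, & I_2 I_3 - I_3 I_2 &= 2I_4, \\
I_1 I_5 - I_5 I_1 &= -2I_5, & I_1 I_6 - I_6 I_1 &= -2I_6, & I_2 I_4 - I_4 I_2 &= -2I_3, \\
I_3 I_5 - I_5 I_3 &= I_1, & I_3 I_6 - I_6 I_3 &= I_2, & I_4 I_5 - I_5 I_4 &= I_2, \\
I_4 I_6 - I_6 I_4 &= -I_1, & I_1 I_2 - I_2 I_1 &= 0, \\
I_2 I_6 - I_6 I_2 &= 2I_5, & I_3 I_4 - I_4 I_3 &= 0, \\
I_2 I_5 - I_5 I_2 &= -2I_6, & I_5 I_6 - I_6 I_5 &= 0.
\end{aligned}$$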
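The fifteen relationships follow from direct multiplication of the six second order matrices, and the sketch below checks them with exact complex arithmetic. It assumes the matrices $I_1, \ldots, I_6$ obtained by differentiating a unimodular matrix whose entries are $1 + \alpha_1 + i\alpha_2$, $\alpha_3 + i\alpha_4$, $\alpha_5 + i\alpha_6$ and a fourth entry fixed by the determinant; the dictionary layout and helper names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

ZERO = [[0, 0], [0, 0]]

# Assumed infinitesimal transformations of the group of unimodular 2x2 matrices.
I = {
    1: [[1, 0], [0, -1]],
    2: [[1j, 0], [0, -1j]],
    3: [[0, 1], [0, 0]],
    4: [[0, 1j], [0, 0]],
    5: [[0, 0], [1, 0]],
    6: [[0, 0], [1j, 0]],
}

# Expected value of I_p I_q - I_q I_p for every pair p < q.
expected = {
    (1, 2): ZERO,            (1, 3): scale(2, I[3]),  (1, 4): scale(2, I[4]),
    (1, 5): scale(-2, I[5]), (1, 6): scale(-2, I[6]), (2, 3): scale(2, I[4]),
    (2, 4): scale(-2, I[3]), (2, 5): scale(-2, I[6]), (2, 6): scale(2, I[5]),
    (3, 4): ZERO,            (3, 5): I[1],            (3, 6): I[2],
    (4, 5): I[2],            (4, 6): scale(-1, I[1]), (5, 6): ZERO,
}

all_fifteen_hold = all(commutator(I[p], I[q]) == rhs for (p, q), rhs in expected.items())
```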

If $I_k$ (k = 1, 2, ..., 6) are infinitesimal transformations for any representation of the group in question, they will also be connected by fifteen relationships

$$I_p I_q - I_q I_p = \sum_{j=1}^{6} C_{pq}^{(j)} I_j$$

with the same coefficients $C_{pq}^{(j)}$. If we introduce the notations:

$$\begin{aligned}
I_3 + iI_4 &= 2A_1; & I_5 + iI_6 &= 2A_2; & I_1 + iI_2 &= 4A_3; \\
I_3 - iI_4 &= 2B_1; & I_5 - iI_6 &= 2B_2; & I_1 - iI_2 &= 4B_3,
\end{aligned} \tag{225}$$

the fifteen relationships may be written as follows:

$$A_p B_q - B_q A_p = 0 \qquad (p, q = 1, 2, 3) \tag{226}$$

together with the six relationships:

$$A_3 A_1 - A_1 A_3 = A_1, \qquad A_3 A_2 - A_2 A_3 = -A_2, \qquad A_1 A_2 - A_2 A_1 = 2A_3; \tag{227}$$

$$B_3 B_1 - B_1 B_3 = B_1, \qquad B_3 B_2 - B_2 B_3 = -B_2, \qquad B_1 B_2 - B_2 B_1 = 2B_3. \tag{228}$$
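For the second order matrices of the group itself, the combinations (225) can be formed explicitly. The sketch below assumes the pairing $2A_1 = I_3 + iI_4$, $2A_2 = I_5 + iI_6$, $4A_3 = I_1 + iI_2$ and the corresponding $B_k$ with the opposite sign of i, which is a reconstruction of (225); under that assumption every $A_k$ vanishes and the $B_k$ satisfy (228). Helper names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def lin(c1, m1, c2, m2):
    """The matrix c1*m1 + c2*m2."""
    return [[c1 * m1[i][j] + c2 * m2[i][j] for j in range(2)] for i in range(2)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

ZERO = [[0, 0], [0, 0]]
I1, I2 = [[1, 0], [0, -1]], [[1j, 0], [0, -1j]]
I3, I4 = [[0, 1], [0, 0]], [[0, 1j], [0, 0]]
I5, I6 = [[0, 0], [1, 0]], [[0, 0], [1j, 0]]

# Assumed reading of (225).
A1, A2, A3 = lin(0.5, I3, 0.5j, I4), lin(0.5, I5, 0.5j, I6), lin(0.25, I1, 0.25j, I2)
B1, B2, B3 = lin(0.5, I3, -0.5j, I4), lin(0.5, I5, -0.5j, I6), lin(0.25, I1, -0.25j, I2)

a_vanish = (A1 == ZERO and A2 == ZERO and A3 == ZERO)
relations_228_hold = (
    commutator(B3, B1) == B1
    and commutator(B3, B2) == scale(-1, B2)
    and commutator(B1, B2) == scale(2, B3)
)
```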


Notice that relationships (226) and (227) are satisfied trivially if we
take the matrices $I_j$, since in this case $A_k = 0$ (k = 1, 2, 3). Relationships
(227) are the same as (213) and the arguments of the previous




section remain in force. We apply the relationships to the infinitesimal transformations of any linear representation of group (223). If $v_j$ is the eigenvector of operator $A_3$ related to the maximum eigenvalue, there are (2j + 1) eigenvectors $v_k$ (k = j, j-1, ..., -j+1, -j) of operator $A_3$ which are transformed by operators $A_1, A_2, A_3$ in accordance with (221), where $v_{j+1} = 0$ and $v_{-j-1} = 0$. Let $L^{(j)}$ be the subspace formed by all the eigenvectors of $A_3$ related to the eigenvalue j. We show that if v belongs to $L^{(j)}$, the vectors $B_q v$ (q = 1, 2, 3) also belong to $L^{(j)}$. For, by (226):

$$A_3 (B_q v) = B_q (A_3 v) = B_q (jv) = j B_q v,$$

whence it follows that $B_q v$ is the eigenvector of $A_3$ corresponding to the eigenvalue j (or is the null vector), i.e. $B_q v$ belongs to $L^{(j)}$. We can repeat our arguments of [85] for $L^{(j)}$, replacing operators $A_s$ by $B_s$. The vector sequence $v_{j,k'}$ (k' = j', j'-1, ..., -j'+1, -j') can thus be formed in $L^{(j)}$, the vectors being transformed in accordance with (221) with j replaced by j' and $A_s$ by $B_s$. On repeated application of the operator $A_2$, each $v_{j,k'}$ yields (2j + 1) vectors $v_{k,k'}$ (k = j, j-1, ..., -j+1, -j). We thus finally obtain (2j + 1)(2j' + 1) vectors $v_{k,k'}$, for which the following relationships hold:

$$\begin{aligned}
A_1 v_{k,k'} &= \sqrt{j(j+1) - k(k+1)}\; v_{k+1,k'}, & B_1 v_{k,k'} &= \sqrt{j'(j'+1) - k'(k'+1)}\; v_{k,k'+1}, \\
A_2 v_{k,k'} &= \sqrt{j(j+1) - k(k-1)}\; v_{k-1,k'}, & B_2 v_{k,k'} &= \sqrt{j'(j'+1) - k'(k'-1)}\; v_{k,k'-1}, \\
A_3 v_{k,k'} &= k\, v_{k,k'}, & B_3 v_{k,k'} &= k'\, v_{k,k'}.
\end{aligned} \tag{229}$$
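Relationships (229) let the $A_s$ act on the first index only and the $B_s$ on the second index only, which is why (226) holds. The sketch below illustrates this for the smallest non-trivial case j = j' = 1/2, realising the operators as Kronecker products with the unit matrix; this Kronecker-product realisation and the helper names are assumptions made for the illustration, not the book's notation.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product: row index (i, k), column index (j, l)."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def scale(c, m):
    return [[c * x for x in row] for row in m]

half = Fraction(1, 2)
unit = [[1, 0], [0, 1]]
# Matrices of (221) for j = 1/2 in the basis v_{1/2}, v_{-1/2}.
raising = [[0, 1], [0, 0]]
lowering = [[0, 0], [1, 0]]
diagonal = [[half, 0], [0, -half]]

A1, A2, A3 = (kron(m, unit) for m in (raising, lowering, diagonal))   # act on k
B1, B2, B3 = (kron(unit, m) for m in (raising, lowering, diagonal))   # act on k'

zero4 = [[0] * 4 for _ in range(4)]
relation_226_holds = all(commutator(a, b) == zero4 for a in (A1, A2, A3) for b in (B1, B2, B3))
relations_227_hold = (commutator(A3, A1) == A1 and commutator(A3, A2) == scale(-1, A2)
                      and commutator(A1, A2) == scale(2, A3))
relations_228_hold = (commutator(B3, B1) == B1 and commutator(B3, B2) == scale(-1, B2)
                      and commutator(B1, B2) == scale(2, B3))
```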


These expressions define operators $A_s$ and $B_s$ in a (2j + 1)(2j' + 1)-dimensional space, and operators $I_s$ are defined in accordance with (225), after which equations (204) can lead only to a single linear representation of the group. This is the representation that we formed in [80].

We have followed in these last sections the lines of the treatment in van der Waerden's Die gruppentheoretische Methode in der Quantenmechanik (Springer, Berlin, 1932).


87. Auxiliary formulae. We return to the formulae of [82]. We have 


$$G_\beta G_\alpha = G_\gamma; \tag{230}$$

the $\gamma_s$ being given in terms of $\alpha_s$ and $\beta_s$ in accordance with (189) or (193), which




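define the basic group operation. We form a matrix which we denote by $S(G_\beta, G_\alpha)$, depending on variables $\alpha_s$ and $\beta_s$, i.e. on the group elements $G_\alpha$ and $G_\beta$; the elements of the matrix are given by the following formulae:

$$S_{ik}(G_\beta, G_\alpha) = \frac{\partial \gamma_i}{\partial \beta_k} \qquad (i, k = 1, 2, \ldots, r). \tag{231}$$

We have already considered this matrix in [82] for $\beta_s = 0$, i.e. with $G_\beta = E$, where E is the identity element of the group. Let us investigate the properties of the matrix. An immediate consequence of the definition is:

$$S(G_\beta, E) = I. \tag{232}$$

We show that also:

$$S(G_\beta, G_\alpha)\, S(E, G_\beta) = S(E, G_\beta G_\alpha). \tag{233}$$

We put $G_\alpha = G_{\alpha''} G_{\alpha'}$, so that

$$G_\gamma = G_\beta G_\alpha = (G_\beta G_{\alpha''})\, G_{\alpha'} = G_{\gamma'} G_{\alpha'} \qquad (G_{\gamma'} = G_\beta G_{\alpha''}).$$

We use the rule for differentiating functions of a function:

$$\frac{\partial \gamma_i}{\partial \beta_k} = \sum_{s=1}^{r} \frac{\partial \gamma_i}{\partial \gamma'_s}\, \frac{\partial \gamma'_s}{\partial \beta_k} = \sum_{s=1}^{r} S_{is}(G_{\gamma'}, G_{\alpha'})\, S_{sk}(G_\beta, G_{\alpha''}),$$

whence

$$S(G_\beta, G_{\alpha''} G_{\alpha'}) = S(G_\beta G_{\alpha''}, G_{\alpha'})\, S(G_\beta, G_{\alpha''}).$$

If we put $G_\beta = E$, $G_{\alpha''} = G_\beta$ and $G_{\alpha'} = G_\alpha$ in this equation, we obtain (233). With $G_\alpha = G_\beta^{-1}$, we get an expression for the inverse matrix to $S(E, G_\beta)$:

$$S^{-1}(E, G_\beta) = S(G_\beta, G_\beta^{-1}). \tag{234}$$

Matrix $S(E, G_\beta)$ becomes $S(\beta_s)$ in the notation of [82], and the inverse matrix becomes $T(\beta_s)$. We shall write these at the moment as $S(G_\beta)$ and $T(G_\beta)$:

$$S(E, G_\beta) = S(G_\beta); \qquad S^{-1}(E, G_\beta) = T(G_\beta). \tag{235}$$

We have

$$S(G_\beta)\, T(G_\beta) = T(G_\beta)\, S(G_\beta) = I. \tag{236}$$

Equation (233) gives

$$S(G_\beta, G_\alpha) = S(E, G_\gamma)\, S^{-1}(E, G_\beta) = S(G_\gamma)\, S^{-1}(G_\beta), \tag{237}$$

and (231) can be written in the form

$$\frac{\partial \gamma_i}{\partial \beta_k} = \sum_{s=1}^{r} S_{is}(G_\gamma)\, T_{sk}(G_\beta). \tag{238}$$

On multiplying both sides by $T_{li}(G_\gamma)$ and summing over i, we get by (236):

$$\sum_{i=1}^{r} T_{li}(G_\gamma)\, \frac{\partial \gamma_i}{\partial \beta_k} = T_{lk}(G_\beta). \tag{239}$$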
{=1 


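We differentiate (238) with respect to $\beta_l$:

$$\frac{\partial^2 \gamma_i}{\partial \beta_k\, \partial \beta_l} = \sum_{s=1}^{r} \sum_{p=1}^{r} \frac{\partial S_{is}(G_\gamma)}{\partial \gamma_p}\, \frac{\partial \gamma_p}{\partial \beta_l}\, T_{sk}(G_\beta) + \sum_{s=1}^{r} S_{is}(G_\gamma)\, \frac{\partial T_{sk}(G_\beta)}{\partial \beta_l},$$

whence, using expression (238) for $\partial \gamma_p / \partial \beta_l$:

$$\frac{\partial^2 \gamma_i}{\partial \beta_k\, \partial \beta_l} = \sum_{s,p,q=1}^{r} \frac{\partial S_{is}(G_\gamma)}{\partial \gamma_p}\, S_{pq}(G_\gamma)\, T_{ql}(G_\beta)\, T_{sk}(G_\beta) + \sum_{s=1}^{r} S_{is}(G_\gamma)\, \frac{\partial T_{sk}(G_\beta)}{\partial \beta_l}.$$

On interchanging k and l on the right-hand side, using the independence of the left-hand side on the order of differentiation and interchanging the variables of summation s and q, we get

$$\sum_{p,q,s=1}^{r} \left[\frac{\partial S_{is}(G_\gamma)}{\partial \gamma_p}\, S_{pq}(G_\gamma) - \frac{\partial S_{iq}(G_\gamma)}{\partial \gamma_p}\, S_{ps}(G_\gamma)\right] T_{ql}(G_\beta)\, T_{sk}(G_\beta) = -\sum_{s=1}^{r} S_{is}(G_\gamma) \left[\frac{\partial T_{sk}(G_\beta)}{\partial \beta_l} - \frac{\partial T_{sl}(G_\beta)}{\partial \beta_k}\right].$$

We multiply both sides by the product $S_{lf}(G_\beta)\, S_{kg}(G_\beta)\, T_{hi}(G_\gamma)$ and sum over i, k and l from 1 to r. Taking (236) into account, we get the equivalent system of equations:

$$\sum_{i,p=1}^{r} \left[\frac{\partial S_{ig}(G_\gamma)}{\partial \gamma_p}\, S_{pf}(G_\gamma) - \frac{\partial S_{if}(G_\gamma)}{\partial \gamma_p}\, S_{pg}(G_\gamma)\right] T_{hi}(G_\gamma) = -\sum_{k,l=1}^{r} S_{lf}(G_\beta)\, S_{kg}(G_\beta) \left[\frac{\partial T_{hk}(G_\beta)}{\partial \beta_l} - \frac{\partial T_{hl}(G_\beta)}{\partial \beta_k}\right].$$

We easily pass from these equations to the above, by multiplying both sides by the product $T_{fl}(G_\beta)\, T_{gk}(G_\beta)\, S_{ih}(G_\gamma)$ and summing over f, g, and h. In the latter system, the left-hand side depends only on $\gamma_s$ and the right-hand side only on $\beta_s$. Hence, in view of the arbitrariness of $G_\alpha$ in (230) and the independence of $\beta_s$ and $\gamma_s$, both sides of the last equation must be equal to the same constant, and in particular:

$$\sum_{k,l=1}^{r} S_{lf}(G_\beta)\, S_{kg}(G_\beta) \left[\frac{\partial T_{hk}(G_\beta)}{\partial \beta_l} - \frac{\partial T_{hl}(G_\beta)}{\partial \beta_k}\right] = C_{fg}^{(h)}.$$

We can write, with a change of subscripts:

$$\sum_{s,t=1}^{r} S_{ti}(G_\alpha)\, S_{sk}(G_\alpha) \left[\frac{\partial T_{ps}(G_\alpha)}{\partial \alpha_t} - \frac{\partial T_{pt}(G_\alpha)}{\partial \alpha_s}\right] = C_{ik}^{(p)}. \tag{240}$$

If we put $G_\alpha = E$ in this identity, i.e. $\alpha_s = 0$ (s = 1, ..., r) and use the fact that S(E) = I, we get

$$C_{ik}^{(p)} = \left[\frac{\partial T_{pk}(G_\alpha)}{\partial \alpha_i} - \frac{\partial T_{pi}(G_\alpha)}{\partial \alpha_k}\right]_{\alpha_s = 0}.$$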




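On comparing with (199) of [82], we see that $C_{ik}^{(p)}$ are the structural constants that we defined above. On multiplying both sides of (240) by $T_{il}(G_\alpha)\, T_{km}(G_\alpha)$ and summing over i and k, we get by (236):

$$\frac{\partial T_{pm}(G_\alpha)}{\partial \alpha_l} - \frac{\partial T_{pl}(G_\alpha)}{\partial \alpha_m} = \sum_{i,k=1}^{r} C_{ik}^{(p)}\, T_{il}(G_\alpha)\, T_{km}(G_\alpha). \tag{241}$$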


We return to (207) and (208). As we saw, (208) is obtained by equating the square bracket in (207) to zero with $\alpha_s = 0$ (s = 1, ..., r). It is easily shown by using (241) that it follows from (208) that the square bracket in (207) in fact vanishes for any $\alpha_s$.

We write the second term of this bracket as

$$\sum_{j,k=1}^{r} T_{jp} T_{kq}\, I_j I_k - \sum_{j,k=1}^{r} T_{jq} T_{kp}\, I_j I_k,$$

the argument $G_\alpha$ of T being omitted. If we interchange j and k in the second term here, we get

$$\sum_{j,k=1}^{r} T_{jp} T_{kq}\, (I_j I_k - I_k I_j) = \sum_{j,k,s=1}^{r} T_{jp} T_{kq}\, C_{jk}^{(s)} I_s.$$

On transforming the first term

$$\sum_{j=1}^{r} \left(\frac{\partial T_{jp}}{\partial \alpha_q} - \frac{\partial T_{jq}}{\partial \alpha_p}\right) I_j$$

in the bracket of (207) in accordance with (241), we at once get the same result but with the reverse sign. We consider, along with $S(G_\beta, G_\alpha)$, the matrix $S'(G_\beta, G_\alpha)$, the elements of which are given by

$$S'_{ik}(G_\beta, G_\alpha) = \frac{\partial \gamma_i}{\partial \alpha_k}. \tag{242}$$

We can prove, precisely as above, the expressions:

$$\begin{aligned}
S'(E, G_\alpha) &= I, \\
S'(G_\beta G_\alpha, E) &= S'(G_\beta, G_\alpha)\, S'(G_\alpha, E), \\
S'^{-1}(G_\alpha, E) &= S'(G_\alpha^{-1}, G_\alpha),
\end{aligned} \tag{243}$$

which we shall require later.


88. The formation of groups with given structural constants. The present 
section is concerned with the general outlines of the problem of constructing 
a group operation and a group of linear transformations with given structural
constants satisfying (194$_1$) and (194$_2$). The construction is based on the theorem
previously mentioned from the theory of partial differential equations, which 
we now proceed to formulate. 




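Suppose we have the following system of partial differential equations

$$\frac{\partial z_i}{\partial x_k} = X_{ik}(x_1, \ldots, x_n;\, z_1, \ldots, z_m) \tag{244}$$

(i = 1, 2, ..., m; k = 1, 2, ..., n).

We use the system to write down the condition that

$$\frac{\partial^2 z_i}{\partial x_k\, \partial x_l} = \frac{\partial^2 z_i}{\partial x_l\, \partial x_k}.$$

It evidently has the form

$$\frac{\partial X_{ik}}{\partial x_l} + \sum_{s=1}^{m} \frac{\partial X_{ik}}{\partial z_s}\, \frac{\partial z_s}{\partial x_l} = \frac{\partial X_{il}}{\partial x_k} + \sum_{s=1}^{m} \frac{\partial X_{il}}{\partial z_s}\, \frac{\partial z_s}{\partial x_k},$$

or, substituting for $\partial z_s/\partial x_l$ and $\partial z_s/\partial x_k$ from (244):

$$\frac{\partial X_{ik}}{\partial x_l} + \sum_{s=1}^{m} \frac{\partial X_{ik}}{\partial z_s}\, X_{sl} = \frac{\partial X_{il}}{\partial x_k} + \sum_{s=1}^{m} \frac{\partial X_{il}}{\partial z_s}\, X_{sk}. \tag{245}$$

This equation expresses the relationship between variables $x_k$, $z_i$.

Theorem. If the $X_{ik}$ are continuous functions and their partial derivatives that appear in (245) are continuous at and in the neighbourhood of $x_k = x_k^{(0)}$, $z_i = z_i^{(0)}$, and if all relationships (245) are satisfied identically with respect to $x_k$, $z_i$, then system (244) is uniquely soluble for the initial conditions

$$z_i \big|_{x_k = x_k^{(0)}} = z_i^{(0)}.$$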

The satisfaction of all equations (245) as an identity with the continuity 
conditions mentioned is generally known as the condition for complete integra- 
bility of system (244). We now sketch out the construction of a group operation 
and of a group of linear transformations with given structural constants. 

Let the structural constants $C_{ik}^{(p)}$, where i, k, p = 1, 2, ..., r, be given, and let (194$_1$) and (194$_2$) be satisfied by the constants.

We can verify by solving system (241) with respect to the partial derivatives that the relationships mentioned represent the condition for complete integrability of the system. Thus there exists a unique matrix $T(G_\alpha)$ with elements $T_{pq}(G_\alpha)$ (p, q = 1, 2, ..., r) which becomes a unit matrix with $G_\alpha = E$, i.e. with $\alpha_s = 0$ (s = 1, 2, ..., r) and which satisfies system (241). Having found $T(G_\alpha)$, we can form its inverse $S(G_\alpha) = T^{-1}(G_\alpha)$. To obtain the group operation, we return to (238). The right-hand sides of these equations are known functions of $\beta_s$ and $\gamma_s$ (s = 1, 2, ..., r). It can be verified that system (241) expresses the condition for complete integrability of system (238). Hence there exists a unique solution of system (238) which satisfies the initial conditions

$$\gamma_i \big|_{\beta_s = 0} = \alpha_i.$$


This solution in fact gives the group operation. The initial conditions express
the fact that element $G_\gamma$, defined by (230), becomes $G_\alpha$ for $\beta_s = 0$ (s = 1, 2, ..., r).
We now pass to the formation of the group of linear transformations, i.e.




of the group of matrices of a given order, for the given structural constants, the matrix $T(G_\alpha)$ being already obtainable as shown above. As we pointed out in [83], the condition for complete integrability of system (204) or (205) amounts to the vanishing identically of the square bracket in (207) for any choice of subscripts, whilst this last condition is fulfilled, as we proved in [87], if matrices $I_s$ satisfy relationships (208). The solution of the problem must therefore begin with the construction of matrices $I_s$ of a given order satisfying (208). This is a difficult algebraic problem. Having found the $I_s$, we can then assert that system (205) has a unique solution satisfying initial conditions (206). This solution in fact gives the matrix group with given structural constants $C_{ik}^{(p)}$.

It can be shown that integration of system (241) with initial conditions T(E) = I amounts to integration of a system of ordinary linear differential equations with constant coefficients. We state the result. We form the system
of ordinary linear differential equations with constant coefficients: 


dw;,{t) r : 
a) Sig + > CD) ay welt) , 
P,q=1 
where $\delta_{ik} = 0$ for $i \neq k$, $\delta_{ii} = 1$ and $\alpha_1, \alpha_2, \ldots, \alpha_r$ are assumed to be given constants. The functions $T_{ik}(\alpha_s) = w_{ik}(1)$ now satisfy system (241) and initial conditions T(E) = I. A detailed treatment of the problem of forming a continuous group, given the structural constants, together with various other problems of the theory of continuous groups, may be found in L. S. Pontryagin's Nepreryvnye gruppy ('Continuous groups').


89. Integration over groups. We proved in [76, 77] a number of relationships which contained the sums of various quantities depending on the elements of a group, the summation being extended over all the group elements. In the case of a continuous group, the summation is replaced by integration with respect to the parameters defining the group elements. Let G be a continuous group such that there corresponds to it for some choice of parameters a bounded closed domain V (the domain with its boundary) in the real r-dimensional space defined by parameters $\alpha_1, \ldots, \alpha_r$, with a definite point of V corresponding to every element of G and vice versa. The functions $\varphi_i(\beta_1, \ldots, \beta_r;\, \alpha_1, \ldots, \alpha_r)$ defining the group operation are assumed continuous and differentiable a sufficient number of times inside V. Furthermore, these functions and their derivatives are assumed continuous up to and including the boundary of V. The parameters $\bar\alpha_s$, corresponding to the element $G_\alpha^{-1}$, are assumed to be continuous functions of the parameters $\alpha_s$. A group with these properties is generally said to be compact. To define integration over the group, we consider the determinant of matrix $S'(G_\beta, G_\alpha)$ [87] and introduce the following notation for this:

$$\Delta'(G_\beta, G_\alpha) = \left|\frac{\partial \gamma_i}{\partial \alpha_k}\right|_{1}^{r}. \tag{246}$$

We have directly from (243):

$$\Delta'(E, G_\alpha) = 1, \tag{247$_1$}$$

$$\Delta'(G_\beta G_\alpha, E) = \Delta'(G_\beta, G_\alpha)\, \Delta'(G_\alpha, E). \tag{247$_2$}$$




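Using the notation $\delta'(G_\alpha) = \Delta'(G_\alpha, E)$, we can write

$$\Delta'(G_\beta, G_\alpha) = \frac{\delta'(G_\beta G_\alpha)}{\delta'(G_\alpha)}. \tag{248}$$

Hence, observing that $\delta'(E) = \Delta'(E, E) = 1$, we get:

$$\Delta'(G_\alpha^{-1}, G_\alpha) = \frac{1}{\delta'(G_\alpha)}. \tag{249}$$

We introduce the further notation:

$$u'(G_\alpha) = \Delta'(G_\alpha^{-1}, G_\alpha). \tag{250}$$

In view of our above assumptions, $u'(G_\alpha)$ is also a continuous function in the domain V. It does not vanish, since

$$\frac{1}{u'(G_\alpha)} = \delta'(G_\alpha) = \Delta'(G_\alpha, E)$$

is also a continuous function. Observing that u'(E) = 1, we can say that $u'(G_\alpha)$ and $\delta'(G_\alpha)$ are positive functions. By (248), the same can be said of $\Delta'(G_\beta, G_\alpha)$.

Let $f(G_\alpha) = f(\alpha_1, \ldots, \alpha_r)$ be any function continuous in the closed domain V. We define the integral of this function over group G by the formula:

$$\int_G f(G_\alpha)\, dG_\alpha = \int_V f(\alpha_1, \ldots, \alpha_r)\, u'(G_\alpha)\, d\alpha_1 \ldots d\alpha_r, \tag{251}$$

where we have the usual integral over domain V on the right-hand side. We show that this integral has the following property of left-hand invariance:

$$\int_G f(G_\alpha)\, dG_\alpha = \int_G f(G_\beta G_\alpha)\, dG_\alpha, \tag{252}$$

or in coordinate form:

$$\int_V f(\alpha_1, \ldots, \alpha_r)\, u'(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = \int_V f(\gamma_1, \ldots, \gamma_r)\, u'(G_\alpha)\, d\alpha_1 \ldots d\alpha_r, \tag{253}$$

where $G_\beta$ is any fixed element of G. We replace the variable element $G_\alpha$ in the left-hand integral by the variable element $G_\delta$, setting $G_\alpha = G_\beta G_\delta$, the domain of variation of parameters $\delta_1, \ldots, \delta_r$ being V as before. The transformation determinant is

$$\left|\frac{\partial \alpha_i}{\partial \delta_k}\right|_{1}^{r} = \Delta'(G_\beta, G_\delta) = \frac{\delta'(G_\beta G_\delta)}{\delta'(G_\delta)} = \frac{u'(G_\delta)}{u'(G_\beta G_\delta)} = \frac{u'(G_\delta)}{u'(G_\alpha)},$$

and we get

$$\int_V f(\alpha_1, \ldots, \alpha_r)\, u'(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = \int_V f(\alpha_1, \ldots, \alpha_r)\, u'(G_\alpha)\, \frac{u'(G_\delta)}{u'(G_\alpha)}\, d\delta_1 \ldots d\delta_r = \int_G f(G_\beta G_\delta)\, dG_\delta.$$

This is equivalent to (252). The replacement of $G_\alpha$ by $G_\delta$ on the right-hand side is of no importance.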
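Left-hand invariance can be observed numerically in the simplest compact case. The sketch below is an illustration under explicit assumptions: it takes the group of rotations about a fixed axis, parametrised by the angle, with the group operation assumed to be addition of angles modulo $2\pi$, so that the weight u' is identically 1; the test function, the shift and the rectangle rule are hypothetical choices made for the example.

```python
import math

def integrate_over_group(f, n=2000):
    """Rectangle-rule approximation of the integral of f over angles in [-pi, pi), weight 1."""
    h = 2 * math.pi / n
    return h * sum(f(-math.pi + i * h) for i in range(n))

def f(a):
    """Any continuous function of the group element, i.e. a 2*pi-periodic function of the angle."""
    return math.cos(a) ** 2 + math.sin(3 * a) + 0.5 * math.cos(a)

beta = 0.7                                                   # fixed group element
lhs = integrate_over_group(f)                                # integral of f(G_alpha)
rhs = integrate_over_group(lambda a: f(beta + a))            # integral of f(G_beta G_alpha)
```

Both quantities approximate the same number, here $\pi$, since only the $\cos^2$ term has a non-zero mean over the group.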




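Integrals invariant on the right are similarly formed. We introduce the determinant of the matrix $S(G_\beta, G_\alpha)$:

$$\Delta(G_\beta, G_\alpha) = \left|\frac{\partial \gamma_i}{\partial \beta_k}\right|_{1}^{r}. \tag{254}$$

We have as above:

$$\begin{aligned}
\Delta(G_\beta, E) &= 1, \\
\Delta(E, G_\beta G_\alpha) &= \Delta(G_\beta, G_\alpha)\, \Delta(E, G_\beta), \\
\Delta(G_\beta, G_\alpha) &= \frac{\delta(G_\beta G_\alpha)}{\delta(G_\beta)},
\end{aligned} \tag{255}$$

where $\delta(G_\alpha) = \Delta(E, G_\alpha)$. We introduce the positive function:

$$u(G_\alpha) = \frac{1}{\delta(G_\alpha)} = \Delta(G_\alpha, G_\alpha^{-1}), \tag{256}$$

and the integral is defined by the formula:

$$\int_V f(\alpha_1, \ldots, \alpha_r)\, u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = \int_G f(G_\alpha)\, \widetilde{dG}_\alpha. \tag{257}$$

The tilde over the differential distinguishes the integral from (251). We now have the property of right-hand invariance:

$$\int_G f(G_\alpha)\, \widetilde{dG}_\alpha = \int_G f(G_\alpha G_\beta)\, \widetilde{dG}_\alpha. \tag{258}$$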


We now show that replacing $G_\alpha$ by $G_\alpha^{-1}$ under the sign of the integrand produces
a transformation from a left to right-hand invariant integral, and conversely.
We differentiate the equation $G_\gamma = G_\alpha G_\beta$, written parametrically, with respect
to $\alpha_k$, it being assumed throughout below that $G_\beta = G_\alpha^{-1}$:


0A; - 4; OB; = 0A; 2 r OA: Ir OB, r 
og + 2 ape ag mbeme® see f= 2 [sae h [aoe 
so that we have, on taking (246) and (254) into account: 
5B: |r ~_1Vy A(Ga, Gz) r _Uu(Ga) - 
“aay fh) ae Gay ERY” a 


This determinant can be written in another form. It follows from the equation 


OB; |r 


r 


0a; 


Ga, |1 | OB, |1 rs 

that 

0a; id pu u(Ga) 1) 

= _ 260 

a, (ay ve 
or, on changing the places of G, and GZ!: 

Op. r ¥(Ga) 

dag ~)) u(Gaty (261) 


We now return to the integral. We find, on changing the variable of integration 




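in the usual way and using (260):

$$\int_V f(\alpha_1, \ldots, \alpha_r)\, u'(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = \int_V f(\bar\beta_1, \ldots, \bar\beta_r)\, u'(G_\alpha) \left|\, \left|\frac{\partial \alpha_i}{\partial \beta_k}\right|_{1}^{r} \right| d\beta_1 \ldots d\beta_r = \int_V f(\bar\beta_1, \ldots, \bar\beta_r)\, u'(G_\alpha)\, \frac{u(G_\beta)}{u'(G_\alpha)}\, d\beta_1 \ldots d\beta_r.$$

On cancelling $u'(G_\alpha)$ and replacing the variable element $G_\beta$ by the variable $G_\alpha$, we get

$$\int_V f(\bar\alpha_1, \ldots, \bar\alpha_r)\, u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = \int_V f(\alpha_1, \ldots, \alpha_r)\, u'(G_\alpha)\, d\alpha_1 \ldots d\alpha_r. \tag{262}$$

We have similarly, on taking (261) into account:

$$\int_V f(\bar\alpha_1, \ldots, \bar\alpha_r)\, u'(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = \int_V f(\alpha_1, \ldots, \alpha_r)\, u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r. \tag{263}$$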
Vv V 
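We have made no use so far of the compactness of the group. The domain $V$
may even be infinite, though we now have to assume that the function
$f(\alpha_1, \ldots, \alpha_r)$ is such that all the integrals written have a meaning.
We use the compactness to show that $u(G_\alpha) = u'(G_\alpha)$. We consider for this
the determinant

$$ D(G_\beta, G_\alpha) = \left| \frac{\partial \gamma_i}{\partial \beta_k} \right|_1^r , \tag{264} $$

where $G_\gamma = G_\alpha^{-1} G_\beta G_\alpha$, and we prove that

$$ D(G_\beta, G_{\alpha'} G_\alpha) = D(G_{\alpha'}^{-1} G_\beta G_{\alpha'}, G_\alpha)\, D(G_\beta, G_{\alpha'}) . \tag{265} $$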


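We can write:

$$ G_\gamma = (G_{\alpha'} G_\alpha)^{-1} G_\beta (G_{\alpha'} G_\alpha) = G_\alpha^{-1} G_{\beta'} G_\alpha , $$

where $G_{\beta'} = G_{\alpha'}^{-1} G_\beta G_{\alpha'}$, so that

$$ \left| \frac{\partial \gamma_i}{\partial \beta_k} \right|_1^r = \left| \frac{\partial \gamma_i}{\partial \beta'_k} \right|_1^r \cdot \left| \frac{\partial \beta'_i}{\partial \beta_k} \right|_1^r = D(G_{\beta'}, G_\alpha)\, D(G_\beta, G_{\alpha'}) , $$

whence (265) follows. On setting $G_\beta = E$ in (265), we get

$$ D(E, G_{\alpha'} G_\alpha) = D(E, G_\alpha)\, D(E, G_{\alpha'}) . \tag{266} $$

If we introduce the numerical function of an element:

$$ \eta(G_\alpha) = D(E, G_\alpha) , \tag{267} $$

we can write, by (266):

$$ \eta(G_{\alpha'} G_\alpha) = \eta(G_{\alpha'})\, \eta(G_\alpha) , \tag{268} $$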


i.e. corresponding to multiplication of elements we have multiplication of the
corresponding values of the function $\eta(G_\alpha)$. We clearly have:

$$ \eta(E) = 1 \quad \text{and} \quad \eta(G_\alpha)\, \eta(G_\alpha^{-1}) = 1 , \tag{269} $$

and $\eta(G_\alpha)$ is continuous and positive in the closed domain $V$. We now show,
using the compactness of the group, that $\eta(G_\alpha) = 1$ for any element $G_\alpha$.
Suppose, for a given $G_\alpha$, we have $\eta(G_\alpha) \neq 1$. If say $\eta(G_\alpha) < 1$, then by (269)
$\eta(G_\alpha^{-1}) > 1$, and we can always assume that $\eta(G_\alpha) > 1$. With this:

$$ \eta(G_\alpha^n) = [\eta(G_\alpha)]^n \to \infty \quad \text{for} \quad n \to \infty . $$

This contradicts the fact that the function $\eta(G_\alpha)$, continuous in the closed
domain $V$, must be bounded. We next consider the relationship between
$u(G_\alpha)$ and $u'(G_\alpha)$. Let

$$ G_\gamma = G_\beta G_\alpha = G_\alpha^{-1} (G_\alpha G_\beta) G_\alpha = G_\alpha^{-1} G_{\beta'} G_\alpha \qquad (G_{\beta'} = G_\alpha G_\beta) . $$
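We have:

$$ \left| \frac{\partial \gamma_i}{\partial \beta_k} \right|_1^r = \Delta'(G_\beta, G_\alpha) . $$

But on the other hand:

$$ \left| \frac{\partial \gamma_i}{\partial \beta_k} \right|_1^r = \left| \frac{\partial \gamma_i}{\partial \beta'_k} \right|_1^r \cdot \left| \frac{\partial \beta'_i}{\partial \beta_k} \right|_1^r = D(G_{\beta'}, G_\alpha)\, \Delta(G_\alpha, G_\beta) , $$

so that

$$ \Delta'(G_\beta, G_\alpha) = D(G_\alpha G_\beta, G_\alpha)\, \Delta(G_\alpha, G_\beta) . $$

We obtain on setting $G_\beta = G_\alpha^{-1}$:

$$ \Delta'(G_\alpha^{-1}, G_\alpha) = D(E, G_\alpha)\, \Delta(G_\alpha, G_\alpha^{-1}) , $$

i.e.

$$ u'(G_\alpha^{-1}) = \eta(G_\alpha)\, u(G_\alpha^{-1}) \quad \text{or} \quad u'(G_\alpha^{-1}) = u(G_\alpha^{-1}) $$

for any $G_\alpha$, since $\eta(G_\alpha) = 1$. Thus for compact groups, the left invariant
integral (251) is the same as the right invariant integral (257). Moreover, it follows
from (262) and (263) that this integral is the same as

$$ \int_V f(\tilde\alpha_1, \ldots, \tilde\alpha_r)\, u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r . $$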
4 
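The left and right invariant integrals may be different for non-compact groups.
We take as an example the group of linear transformations of the form

$$ z' = e^{\alpha_1} z + \alpha_2 , $$

where $\alpha_1$ and $\alpha_2$ vary from $(-\infty)$ to $(+\infty)$. Here, $r = 2$ and $V$ is the whole
plane. The composition of two transformations gives

$$ z' = e^{\alpha_1} z + \alpha_2 ; \qquad z'' = e^{\beta_1} z' + \beta_2 ; \qquad \gamma_1 = \varphi_1(\beta_1, \beta_2; \alpha_1, \alpha_2) = \beta_1 + \alpha_1 , $$

i.e.

$$ z'' = e^{\beta_1 + \alpha_1} z + (e^{\beta_1} \alpha_2 + \beta_2) , \qquad \gamma_2 = \varphi_2(\beta_1, \beta_2; \alpha_1, \alpha_2) = e^{\beta_1} \alpha_2 + \beta_2 . $$

Parameters $\alpha_1 = \alpha_2 = 0$ correspond to the identity element. The element
$G_\alpha^{-1}$ has parameters $\tilde\alpha_1 = -\alpha_1$, $\tilde\alpha_2 = -\alpha_2 e^{-\alpha_1}$. We evaluate the functional
determinants:

$$ \Delta(G_\beta, G_\alpha) = \begin{vmatrix} 1 & 0 \\ 0 & e^{\beta_1} \end{vmatrix} = e^{\beta_1} ; \qquad \delta(G_\alpha) = e^{\alpha_1} ; \qquad u(G_\alpha) = e^{-\alpha_1} ; $$

$$ \Delta'(G_\beta, G_\alpha) = \begin{vmatrix} 1 & e^{\beta_1} \alpha_2 \\ 0 & 1 \end{vmatrix} = 1 ; \qquad \delta'(G_\alpha) = u'(G_\alpha) = 1 . $$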


The left invariant integral has the form

$$ \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(\alpha_1, \alpha_2)\, e^{-\alpha_1}\, d\alpha_1\, d\alpha_2 , $$

and the right invariant integral:

$$ \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(\alpha_1, \alpha_2)\, d\alpha_1\, d\alpha_2 . $$

We observe that requirements other than compactness of the group are
possible in the proof of the equality of the right and left invariant integrals,
i.e. of the equation $u(G_\alpha) = u'(G_\alpha)$.
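
The example of the group $z' = e^{\alpha_1} z + \alpha_2$ given above can be checked numerically. The following minimal Python sketch approximates the two functional determinants by central differences; the function names (`compose`, `jacobian_det`, `delta`, `delta_prime`, `u`, `u_prime`), the step size and the sample parameter values are illustrative assumptions and do not come from the text.

```python
import math

def compose(beta, alpha):
    """Parameters of G_beta G_alpha for z' = exp(a1) z + a2 (G_alpha acts first)."""
    b1, b2 = beta
    a1, a2 = alpha
    return (b1 + a1, math.exp(b1) * a2 + b2)

def jacobian_det(func, point, h=1e-6):
    """Determinant of the 2x2 Jacobian of func at point, by central differences."""
    cols = []
    for k in range(2):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        fp, fm = func(tuple(plus)), func(tuple(minus))
        # cols[k][i] approximates d f_i / d x_k
        cols.append([(fp[i] - fm[i]) / (2.0 * h) for i in range(2)])
    return cols[0][0] * cols[1][1] - cols[1][0] * cols[0][1]

def delta(beta, alpha):
    """Determinant with respect to the parameters of the right-hand factor."""
    return jacobian_det(lambda a: compose(beta, a), alpha)

def delta_prime(beta, alpha):
    """Determinant with respect to the parameters of the left-hand factor."""
    return jacobian_det(lambda b: compose(b, alpha), beta)

IDENTITY = (0.0, 0.0)

def u(alpha):
    """Weight of the left invariant integral, 1 / Delta(G_alpha, E)."""
    return 1.0 / delta(alpha, IDENTITY)

def u_prime(alpha):
    """Weight of the right invariant integral, 1 / Delta'(E, G_alpha)."""
    return 1.0 / delta_prime(IDENTITY, alpha)

beta, alpha = (0.7, -1.3), (0.4, 2.1)   # arbitrary sample elements
print(delta(beta, alpha), math.exp(beta[0]))
print(delta_prime(beta, alpha))
print(u(alpha), math.exp(-alpha[0]), u_prime(alpha))
```

Under these assumptions the two weights differ, in agreement with the factor $e^{-\alpha_1}$ found for the left invariant integral and the factor unity found for the right invariant integral.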

Let $G'$ be the subgroup consisting of elements of $G$ of the form

$$ G_\alpha G_\beta G_\alpha^{-1} G_\beta^{-1} \tag{270} $$

or of products of these elements, $G_\alpha$ and $G_\beta$ being any elements of $G$.

It may readily be seen that, if an element $G_\gamma$ is included among elements
(270), so also is $G_\gamma^{-1}$.

Similarly, $G_\delta^{-1} G_\gamma G_\delta$, for any choice of $G_\delta$ of $G$, is likewise included among
elements (270). It follows from what has been said above that $G'$, the subgroup
generated by elements (270), is a normal subgroup of $G$. The subgroup
$G'$ reduces to the identity element when and only when all the elements (270)
are identity elements, i.e. when and only when $G$ is an Abelian group. Subgroup
$G'$ may possibly be the same as $G$. This is true, in particular, if $G$ is a
non-Abelian simple group. Subgroup $G'$ is generally known as the derived
group of $G$.

It follows from (268) and (269) that $\eta(G_\alpha G_\beta G_\alpha^{-1} G_\beta^{-1}) = 1$,
that $\eta(G_\gamma) = 1$ for all $G_\gamma$ of $G'$, and that $\eta(G_\alpha)$ has the same value for all
elements belonging to the same coset with respect to $G'$, i.e. $\eta(G_\alpha)$ has the same
value for any element of the group complementary to $G'$. If $G'$ is the same
as $G$, $\eta(G_\alpha) = 1$ for any $G_\alpha$ of $G$. The same is true if the complementary group
just mentioned is compact. But since

$$ \eta(G_\alpha) = 1 , \quad \text{we have} \quad u(G_\alpha) = u'(G_\alpha) . $$

90. Orthogonality. Examples. The property of left and right invariance of
the integral is analogous to the property in the case of finite groups that the
product $G_\beta G_\alpha$ or $G_\alpha G_\beta$ of a variable element $G_\alpha$ and a fixed element $G_\beta$ varies
over all the group elements. We used this property to show that every group
representation is equivalent to a unitary representation, and in proving the
properties of orthogonality. Similar propositions can be proved for compact
groups by using the invariant integral. If $A(G_\alpha)$ are unitary matrices yielding
an irreducible linear representation of a compact group $G$, and $B(G_\alpha)$ are unitary
matrices yielding a non-equivalent irreducible representation, we have the
following expression for the orthogonality of the non-equivalent irreducible
unitary representations, the matrix elements being denoted by two subscripts
as usual:

$$ \int_V \{A(G_\alpha)\}_{ij}\, \overline{\{B(G_\alpha)\}_{kl}}\, u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = 0 . \tag{271} $$


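We obtain for a single irreducible representation:

$$ \int_V \{A(G_\alpha)\}_{ij}\, \overline{\{A(G_\alpha)\}_{kl}}\, u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = \frac{\delta_{ik}\, \delta_{jl}}{p} \int_V u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r , \tag{272} $$

where $p$ is the order of the matrices. Similarly, we have for the characters:

$$ X(G_\alpha) = \sum_{i=1}^{p} \{A(G_\alpha)\}_{ii} ; \qquad X'(G_\alpha) = \sum_{i=1}^{q} \{B(G_\alpha)\}_{ii} , $$

where $p$ and $q$ are the orders of matrices $A(G_\alpha)$ and $B(G_\alpha)$, and these have the
following properties:

$$ \int_V X(G_\alpha)\, \overline{X'(G_\alpha)}\, u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = 0 , \tag{273} $$

$$ \int_V X(G_\alpha)\, \overline{X(G_\alpha)}\, u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r = \int_V u(G_\alpha)\, d\alpha_1 \ldots d\alpha_r . \tag{274} $$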

1. We now consider some examples. Let $G$ be the Abelian group of rotations
of the plane about the origin. Here, $r = 1$, and the single parameter is the angle
of rotation $\alpha$. We take $\alpha$ as belonging to the interval $(0, 2\pi)$, the ends of this
interval being regarded as identical. Successive rotations by angles $\alpha$ and $\beta$
amount to a rotation of $\beta + \alpha$, where the sum must always be made to belong
to $(0, 2\pi)$ by subtracting $2\pi$ if necessary. The functional determinants
$\Delta(G_\beta, G_\alpha)$ and $\Delta'(G_\beta, G_\alpha)$ here reduce to the derivatives of $\beta + \alpha$ with respect
to $\beta$ or $\alpha$, i.e. to unity, so that $u(G_\alpha) = u'(G_\alpha) = 1$. We know that $G$ has
irreducible unitary representations of the first order $e^{im\alpha}$ $(m = 0, \pm 1, \pm 2, \ldots)$,
and (273) and (274) give the familiar expressions:

$$ \int_0^{2\pi} e^{i m_1 \alpha}\, \overline{e^{i m_2 \alpha}}\, d\alpha = \int_0^{2\pi} e^{i (m_1 - m_2) \alpha}\, d\alpha = \begin{cases} 0 & \text{for } m_1 \neq m_2 , \\ 2\pi & \text{for } m_1 = m_2 . \end{cases} \tag{275} $$

We remark that, because of the need for making the sum $\beta + \alpha$ belong to $(0, 2\pi)$,
we may have a singularity as regards the continuity and definition of the
derivatives of the sum; this occurs when the sum is equal to $2\pi$ for $\alpha$ and $\beta$ lying
inside the interval.
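
The orthogonality of the representations $e^{im\alpha}$ of the plane rotation group can be illustrated numerically. The sketch below approximates the integral over $(0, 2\pi)$ with weight $u = 1$ by an equally spaced sum; the function name `inner_product` and the number of sample points are assumptions made for illustration only.

```python
import cmath
import math

def inner_product(m1, m2, n=256):
    """Equally spaced approximation of the integral over (0, 2*pi) of
    exp(i*m1*a) * conj(exp(i*m2*a)) da, the invariant weight being unity."""
    step = 2.0 * math.pi / n
    total = 0j
    for k in range(n):
        a = k * step
        total += cmath.exp(1j * m1 * a) * cmath.exp(1j * m2 * a).conjugate()
    return total * step

print(abs(inner_product(2, 5)))   # close to zero for m1 != m2
print(inner_product(3, 3).real)   # close to 2*pi for m1 == m2
```

For integer $m_1$, $m_2$ with $|m_1 - m_2|$ less than the number of sample points, the equally spaced sum reproduces the two cases of the integral up to rounding error.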

2. We consider the group of rotations of three-dimensional space, using
parameters rather different from those mentioned in [84]. Let the space be
rotated by an angle $\omega$ about an axis forming angles $\alpha, \beta, \gamma$ with the coordinate
axes.

We introduce the four parameters:

$$ a_0 = \cos\frac{\omega}{2} ; \quad a_1 = \cos\alpha\, \sin\frac{\omega}{2} ; \quad a_2 = \cos\beta\, \sin\frac{\omega}{2} ; \quad a_3 = \cos\gamma\, \sin\frac{\omega}{2} . \tag{276} $$

These are connected by the relationship

$$ a_0^2 + a_1^2 + a_2^2 + a_3^2 = 1 . \tag{277} $$

The values $a_0 = 1$, $a_1 = a_2 = a_3 = 0$ correspond to the identity transformation.
We can take $a_1, a_2, a_3$ as the parameters, and $a_0$ as a function of them.


If the rotations defined by parameters $(a_0, a_1, a_2, a_3)$ and $(b_0, b_1, b_2, b_3)$ are
carried out successively, the parameters $(c_0, c_1, c_2, c_3)$ of the resultant
rotation may be readily seen to be given by

$$ \begin{aligned} c_0 &= a_0 b_0 - a_1 b_1 - a_2 b_2 - a_3 b_3 , & c_2 &= a_0 b_2 - a_1 b_3 + a_2 b_0 + a_3 b_1 , \\ c_1 &= a_0 b_1 + a_1 b_0 + a_2 b_3 - a_3 b_2 , & c_3 &= a_0 b_3 + a_1 b_2 - a_2 b_1 + a_3 b_0 . \end{aligned} \tag{278} $$

We find from (277), treating $a_0$ as a function of $a_1, a_2, a_3$:

$$ a_0 \frac{\partial a_0}{\partial a_j} + a_j = 0 \qquad (j = 1, 2, 3) , $$

whence $\partial a_0 / \partial a_j = 0$ for $E$. Using this, we can easily form the functional
determinant for $b_0 = 1$, $b_1 = b_2 = b_3 = 0$:

$$ \frac{D(c_1, c_2, c_3)}{D(b_1, b_2, b_3)} = \begin{vmatrix} a_0 & -a_3 & a_2 \\ a_3 & a_0 & -a_1 \\ -a_2 & a_1 & a_0 \end{vmatrix} = a_0 (a_0^2 + a_1^2 + a_2^2 + a_3^2) = a_0 = \sqrt{1 - a_1^2 - a_2^2 - a_3^2} . $$


The invariant integral becomes

$$ \int_V f(a_1, a_2, a_3)\, \frac{1}{\sqrt{1 - a_1^2 - a_2^2 - a_3^2}}\, da_1\, da_2\, da_3 . \tag{279} $$

The domain $V$ is the sphere with centre at the origin and unit radius. We remark
that expressions (278) are obtained directly from the rule for multiplying
quaternions:

$$ c_0 + c_1 i + c_2 j + c_3 k = (a_0 + a_1 i + a_2 j + a_3 k)(b_0 + b_1 i + b_2 j + b_3 k) , $$

the unities $i$, $j$ and $k$ being subject to the following multiplication rules:

$$ i^2 = j^2 = k^2 = -1 ; \quad ij = -ji = k ; \quad jk = -kj = i ; \quad ki = -ik = j . $$
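
The functional determinant formed above for the rotation group can be checked numerically from the composition rule (278). The following Python sketch is a minimal illustration under stated assumptions: the helper names (`quat_mul`, `lift`, `det3`, `jacobian_at_identity`), the finite-difference step and the sample point are hypothetical, and $a_0$ is taken as the positive square root.

```python
import math

def quat_mul(a, b):
    """Composition rule (278), i.e. the quaternion product a * b."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def lift(p):
    """(a1, a2, a3) -> (a0, a1, a2, a3), with a0 the positive root of (277)."""
    a0 = math.sqrt(1.0 - sum(x * x for x in p))
    return (a0, p[0], p[1], p[2])

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def jacobian_at_identity(p, h=1e-6):
    """D(c1, c2, c3) / D(b1, b2, b3) at b = identity, by central differences."""
    a = lift(p)
    rows = [[0.0] * 3 for _ in range(3)]
    for k in range(3):
        plus, minus = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
        plus[k], minus[k] = h, -h
        c_plus = quat_mul(a, lift(plus))
        c_minus = quat_mul(a, lift(minus))
        for i in range(3):
            rows[i][k] = (c_plus[i + 1] - c_minus[i + 1]) / (2.0 * h)
    return det3(rows)

p = (0.3, -0.2, 0.5)                      # arbitrary point inside the unit sphere
print(jacobian_at_identity(p), lift(p)[0])
```

With these assumptions the computed determinant agrees with $a_0 = \sqrt{1 - a_1^2 - a_2^2 - a_3^2}$, whose reciprocal is the weight appearing in (279).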
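It is easy to establish the connection between parameters $(a_0, a_1, a_2, a_3)$ and
the Eulerian angles $\alpha, \beta, \gamma$. The expressions are

$$ a_0 = \cos\tfrac{1}{2}\beta\, \cos\tfrac{1}{2}(\alpha + \gamma) ; \qquad a_1 = \sin\tfrac{1}{2}\beta\, \sin\tfrac{1}{2}(\gamma - \alpha) ; $$

$$ a_2 = \sin\tfrac{1}{2}\beta\, \cos\tfrac{1}{2}(\gamma - \alpha) ; \qquad a_3 = \cos\tfrac{1}{2}\beta\, \sin\tfrac{1}{2}(\alpha + \gamma) . $$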
The invariant integral may now be written in parameters (a, 8, y) as 


[tle 8.7) sin Paint 5. (a — p) da dp dy, (280) 
v 


where $0 < \alpha < 2\pi$, $0 < \beta < \pi$, $0 < \gamma < 2\pi$. We note that in integral (279)
the function

$$ \frac{1}{\sqrt{1 - a_1^2 - a_2^2 - a_3^2}} = \frac{1}{a_0} $$

becomes infinite if $\omega = \pi$. This is related to the fact that we have $\sin \omega/2$
instead of $\omega$ in expressions (276) for $a_1, a_2, a_3$. It is worth mentioning here that
the properties discussed in [89] in connection with the definition of compactness
only need to be fulfilled for a certain choice of parameters. The properties may
well be lost on changing the parameters. Furthermore, the singularity in the
continuity and definition of the derivatives mentioned at the end of the first
example in connection with the group of rotations of the plane about the origin
will also hold for the group of rotations of three-dimensional space.

It may also be remarked that the equality of the invariant integrals for the 
spatial rotation group has an immediate connection with the fact that this is a 
simple non-Abelian group. 

3. It may easily be verified by direct evaluation that the left and right
invariant integrals are equal for the Lorentz group which, as we have seen,
is homomorphic with the group of linear transformations with unity
determinant:

$$ x_1' = a_0 x_1 + a_1 x_2 , \qquad x_2' = a_2 x_1 + a_3 x_2 \qquad (a_0 a_3 - a_1 a_2 = 1) . \tag{281} $$

The values $a_0 = a_3 = 1$, $a_1 = a_2 = 0$ correspond to the identity element.
We can take $a_0$ as a function of $a_1, a_2, a_3$ and take as parameters the real and
imaginary parts of $a_1$, $a_2$, and $a_3 - 1$. The group operation amounts to
multiplication of second order matrices, and we have:

$$ c_0 = b_0 a_0 + b_1 a_2 , \quad c_1 = b_0 a_1 + b_1 a_3 , \quad c_2 = b_2 a_0 + b_3 a_2 , \quad c_3 = b_2 a_1 + b_3 a_3 . \tag{282} $$

If we put $a_k = \alpha'_k + i\alpha''_k$ $(k = 0, 1, 2, 3)$, the group parameters are $\alpha'_1, \alpha'_2, \alpha'_3$,
$\alpha''_1, \alpha''_2, \alpha''_3$. On writing further $b_k = \beta'_k + i\beta''_k$ and $c_k = \gamma'_k + i\gamma''_k$, to find the
invariant integral we must first evaluate the functional determinants:


$$ \frac{D(\gamma'_1, \gamma'_2, \gamma'_3, \gamma''_1, \gamma''_2, \gamma''_3)}{D(\beta'_1, \beta'_2, \beta'_3, \beta''_1, \beta''_2, \beta''_3)} \quad \text{for} \quad \beta'_1 = \beta'_2 = \beta''_1 = \beta''_2 = \beta''_3 = 0 ; \ \beta'_3 = 1 , $$

or

$$ \frac{D(\gamma'_1, \gamma'_2, \gamma'_3, \gamma''_1, \gamma''_2, \gamma''_3)}{D(\alpha'_1, \alpha'_2, \alpha'_3, \alpha''_1, \alpha''_2, \alpha''_3)} \quad \text{for} \quad \alpha'_1 = \alpha'_2 = \alpha''_1 = \alpha''_2 = \alpha''_3 = 0 ; \ \alpha'_3 = 1 , $$

the fact that $\alpha'_3 = 1$ and not zero for the identity transformation of the group
being of no consequence. We get in both cases the same invariant integral:

$$ \int_V f(\alpha'_1, \alpha'_2, \alpha'_3, \alpha''_1, \alpha''_2, \alpha''_3)\, \frac{1}{\alpha_3'^2 + \alpha_3''^2}\, d\alpha'_1\, d\alpha'_2\, d\alpha'_3\, d\alpha''_1\, d\alpha''_2\, d\alpha''_3 . \tag{283} $$

The domain $V$ is the total six-dimensional space. The equality of the invariant
integrals is related to the fact that, for group (281), the subgroup $G'$ formed
by the elements $G_\alpha G_\beta G_\alpha^{-1} G_\beta^{-1}$, to which we referred in [89], is the same
as the group itself. For it is easily shown that $G'$ does not reduce to the identity
transformation or to the normal subgroup formed by elements $E$ and $(-E)$.
The actual working to find invariant integral (283) becomes simple on the basis
of a lemma in which use is made of analytic functions of several complex
variables (see Chap. IV of the second part of this volume).

LEMMA. Let $w_s = u_s + i v_s$ $(s = 1, 2, \ldots, k)$ be analytic functions of complex
variables $z_s = x_s + i y_s$ $(s = 1, 2, \ldots, k)$. The functional determinant (Jacobian)
of functions $(u_1, v_1, \ldots, u_k, v_k)$ with respect to variables $(x_1, y_1, \ldots, x_k, y_k)$ is
equal to the square of the modulus of the functional determinant of functions
$(w_1, \ldots, w_k)$ with respect to variables $(z_1, \ldots, z_k)$.
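We have (see Chapters I and IV of the second part of this volume):

$$ \frac{\partial u_i}{\partial x_k} = \frac{\partial v_i}{\partial y_k} ; \qquad \frac{\partial u_i}{\partial y_k} = -\frac{\partial v_i}{\partial x_k} , $$

and we can write:

$$ \frac{D(u_1, v_1, \ldots, u_k, v_k)}{D(x_1, y_1, \ldots, x_k, y_k)} = \begin{vmatrix} a_{11} & b_{11} & a_{12} & b_{12} & \ldots & a_{1k} & b_{1k} \\ -b_{11} & a_{11} & -b_{12} & a_{12} & \ldots & -b_{1k} & a_{1k} \\ \vdots & & & & & & \vdots \\ a_{k1} & b_{k1} & a_{k2} & b_{k2} & \ldots & a_{kk} & b_{kk} \\ -b_{k1} & a_{k1} & -b_{k2} & a_{k2} & \ldots & -b_{kk} & a_{kk} \end{vmatrix} , $$

where

$$ a_{ik} = \frac{\partial u_i}{\partial x_k} ; \qquad b_{ik} = \frac{\partial u_i}{\partial y_k} . $$

On adding to each odd column the next even column multiplied by $i$, we get
the determinant:

$$ \begin{vmatrix} c_{11} & b_{11} & c_{12} & b_{12} & \ldots & c_{1k} & b_{1k} \\ i c_{11} & a_{11} & i c_{12} & a_{12} & \ldots & i c_{1k} & a_{1k} \\ \vdots & & & & & & \vdots \\ c_{k1} & b_{k1} & c_{k2} & b_{k2} & \ldots & c_{kk} & b_{kk} \\ i c_{k1} & a_{k1} & i c_{k2} & a_{k2} & \ldots & i c_{kk} & a_{kk} \end{vmatrix} \qquad (c_{ik} = a_{ik} + i b_{ik}) . $$

Further, on subtracting from each even row the previous odd row multiplied
by $i$, we get

$$ \begin{vmatrix} c_{11} & b_{11} & c_{12} & b_{12} & \ldots & c_{1k} & b_{1k} \\ 0 & \bar c_{11} & 0 & \bar c_{12} & \ldots & 0 & \bar c_{1k} \\ \vdots & & & & & & \vdots \\ c_{k1} & b_{k1} & c_{k2} & b_{k2} & \ldots & c_{kk} & b_{kk} \\ 0 & \bar c_{k1} & 0 & \bar c_{k2} & \ldots & 0 & \bar c_{kk} \end{vmatrix} . $$

Transferring the odd columns to the left and the even rows to the bottom, we get

$$ \begin{vmatrix} c_{11} & c_{12} & \ldots & c_{1k} & b_{11} & b_{12} & \ldots & b_{1k} \\ \vdots & & & & & & & \vdots \\ c_{k1} & c_{k2} & \ldots & c_{kk} & b_{k1} & b_{k2} & \ldots & b_{kk} \\ 0 & 0 & \ldots & 0 & \bar c_{11} & \bar c_{12} & \ldots & \bar c_{1k} \\ \vdots & & & & & & & \vdots \\ 0 & 0 & \ldots & 0 & \bar c_{k1} & \bar c_{k2} & \ldots & \bar c_{kk} \end{vmatrix} , $$

whence it follows that

$$ \frac{D(u_1, v_1, \ldots, u_k, v_k)}{D(x_1, y_1, \ldots, x_k, y_k)} = \begin{vmatrix} c_{11} & \ldots & c_{1k} \\ \vdots & & \vdots \\ c_{k1} & \ldots & c_{kk} \end{vmatrix} \cdot \begin{vmatrix} \bar c_{11} & \ldots & \bar c_{1k} \\ \vdots & & \vdots \\ \bar c_{k1} & \ldots & \bar c_{kk} \end{vmatrix} = \left| \frac{D(w_1, \ldots, w_k)}{D(z_1, \ldots, z_k)} \right|^2 . $$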
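
The statement of the lemma can be illustrated for the simplest analytic functions, namely linear ones, $w_i = \sum_k c_{ik} z_k$ with constant complex coefficients. The Python sketch below is a minimal check under that assumption; the helper names `det` and `real_form` and the sample matrix `C` are hypothetical and chosen only for illustration.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting
    (works for real or complex entries)."""
    m = [list(row) for row in matrix]
    n = len(m)
    d = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-15:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= factor * m[c][j]
    return d

def real_form(cmat):
    """Real Jacobian of (u1, v1, ..., uk, vk) with respect to (x1, y1, ..., xk, yk)
    for the linear map w = C z, using the Cauchy-Riemann equations."""
    k = len(cmat)
    real = [[0.0] * (2 * k) for _ in range(2 * k)]
    for i in range(k):
        for j in range(k):
            a, b = cmat[i][j].real, cmat[i][j].imag
            real[2 * i][2 * j], real[2 * i][2 * j + 1] = a, -b      # du/dx, du/dy
            real[2 * i + 1][2 * j], real[2 * i + 1][2 * j + 1] = b, a  # dv/dx, dv/dy
    return real

C = [[1 + 2j, 3 - 1j], [0.5j, 2 + 1j]]    # arbitrary sample coefficients
print(det(real_form(C)), abs(det(C)) ** 2)
```

For this sample matrix the complex determinant is $-0.5 + 3.5i$, so both printed values should agree with $12.5$ up to rounding error.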


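We next evaluate the function $u(G_\alpha)$ in the invariant integral. For this, we have
to evaluate, in accordance with the lemma, the functional determinant

$$ \frac{D(c_1, c_2, c_3)}{D(b_1, b_2, b_3)} \quad \text{for} \quad b_0 = b_3 = 1 ; \ b_1 = b_2 = 0 \tag{284} $$

or

$$ \frac{D(c_1, c_2, c_3)}{D(a_1, a_2, a_3)} \quad \text{for} \quad a_0 = a_3 = 1 ; \ a_1 = a_2 = 0 . \tag{285} $$

It follows from the relationship $a_0 a_3 - a_1 a_2 = 1$ that, for the identity element:

$$ \frac{\partial a_0}{\partial a_1} = \frac{\partial a_0}{\partial a_2} = 0 ; \qquad \frac{\partial a_0}{\partial a_3} = -1 . $$

We have further:

$$ \frac{\partial c_1}{\partial a_1} = b_0 ; \qquad \frac{\partial c_1}{\partial a_2} = 0 ; \qquad \frac{\partial c_1}{\partial a_3} = b_1 ; $$

$$ \frac{\partial c_2}{\partial a_1} = b_2 \frac{\partial a_0}{\partial a_1} ; \qquad \frac{\partial c_2}{\partial a_2} = b_3 + b_2 \frac{\partial a_0}{\partial a_2} ; \qquad \frac{\partial c_2}{\partial a_3} = b_2 \frac{\partial a_0}{\partial a_3} ; $$

$$ \frac{\partial c_3}{\partial a_1} = b_2 ; \qquad \frac{\partial c_3}{\partial a_2} = 0 ; \qquad \frac{\partial c_3}{\partial a_3} = b_3 , $$

so that determinant (285) is equal to $b_3 (b_0 b_3 - b_1 b_2) = b_3$, and by the lemma the
corresponding real functional determinant is $|b_3|^2$, whence (283) follows. The same
result is obtained on the basis of expression (284).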


INDEX 


Abelian group 189, 208, 240 
Anti-symmetric functions 43 
Associative law for matrices 78 


Bessel’s inequality 162, 176 
Bilinear forms 142 

Bounded transformation 171 
Buniakowski’s inequality 111 


Cauchy test for vectors 162 
Characteristic 
equation of matrix 100 
roots 
of matrix 100 unitary 145 
of quadratic forms 135 
Characters of representation 277 
Class of group 211
Closure 
equation of 168, 176 
generalized 168, 177 
Cofactor 
of element 12 
of minor 15 
Commutative Hermitian matrices 143 
Complete system 
of functions 178 
of vectors 167 
condition for 170 
Composition of linear representations 
262 
Conjugate, complex 47 


Continuous groups 291 
Contragradient transformation 80 
Contravariant 

affine vectors 81 

vector components 83 
Cosets with respect to subgroup 209 
Covariant 

affine vectors 81 

vector components 83 
Cramer’s theorem 30 
Cycle 203 


Derived group of a group 316 
Determinant 

evaluation of 16 

functional 62 

notation 6, 12 

sign-rule for 11 
Differentiation of matrices 295 
Direct product 

of groups 264 

of matrices 259 
Discriminant of quadratic form 131 


Eigenfunctions of operator 185 
Eigenvalues see Characteristic roots 
Elementary divisors 103 

Eulerian angles 73 


Factor group 214 
Fourier coefficients 176 
Functional operator 181 


Gradient of function 82 
Gram’s 

determinant 55 

inequality 57, 58 
Group 

abstract 205 

compact 311 

cyclical 190 

dihedral, of order n 283 


of linear transformations (matrices) 


188 
of permutations 205 
alternating 203 
rotation, simplicity of 252 
simple 214 


Hadamard’s inequality 55 
Hermitian 

form 112 

operator 182 
Hilbert space 159 
Homomorphic groups 217 


Implicit function theorem 66 
Index of subgroup 209 
Integral 

left-hand invariant 312 

right-hand invariant 312 
Integration over group 311 
Inverse element of group 206 
Inversion 

in permutations 3 

of system of functions 64 
Isomorphic groups 216 


Kernel of homomorphism 218 


Laplace’s determinant theorem 16 
Limit 
of complex variable 157 
of variable vector 162 
Linear forms 39 
complete system of 40 
completion of a system of 41 
homogeneous functions 11 




linearly 
dependent 39 
independent 39 
operator 181 
transformations 70, 90 
Lorentz transformations 194 
general 198 
positive 200 


Matrix (array) 74 
conjugate (Hermitian) 96 
diagonal 77, 91 
Hermitian 117 
non-singular 90 
projection 149 
quasi-diagonal 97 
rectangular 27 
scalar 237 
similar 92 
transposed 79 
unit 75, 91 
Minor 
of determinant 12 
of order / 15 
principal (leading) 23 
Modulus of complex number 47 
Multiplication of determinants 23 


Norm 
of function 175 
of vector 48, 160 
Normal subgroup of group 213 


Objects of linear representation 233 
Order 

of element 206 

of group 209 
Orthogonal transformations 105 


Parameters, Cayley—Klein 227 
Permutation(s) 6 

basic 8, 9 

inverse 201 
Polynomial, matrix 154 
Powers of matrix 95 




Product 
matrix 76 
permutation 202 
rectangular matrix 28 
transformation 75, 90 

Projection of vector on sub-space 51, 
150 

Pythagoras’ theorem 48, 166, 170 


Quadratic forms 
alternating 127 
constant sign 127 
definitely 
negative 127 
positive 127 
law of inertia for 126 
matrix of 114 


Rank 
of matrix 28 
of system of forms 40 
Rational fraction of matrices 156 
Reduction of matrix 
to canonical form 102 
unitary, to diagonal form 145 
Representation of group(s) 
linear 233 
of direct product 264 
equivalent 233 
irreducible 234 
one-to-one 232 
reduced 234 
rotation 249 
and Laplace’s equation 253 
Unitary, in two variables 242 


regular 281 


Sarrus’ rule 17 
Scalar product 

of functions 175 

of vectors 47, 105, 160 
Schrödinger equation 263
Secular equation 59 

A. N. Krilov’s method 60 
Similar groups of matrices 193 




Stereographic projection 220 
Structural constants of group 292 
Subgroups 208 
similar 213 
Sub-space 
complementary orthogonal 51 
of vector space 45, 46, 98 
System of first degree equations 30, 31 
homogeneous 36 
interpretation of 49 
linearly independent solutions of 
38 
principle of superposition of solu- 
tions 38 
zero solution of 37 


Tensor 
deformation 89 
of second rank 
contravariant 84 
covariant 84 
mixed 84 
stress 88 
Transformations 
group, infinitesimal 295
identity 74, 91 
inverse 75 
non-singular 90 
proper 74 
similar 80 
symmetry, with respect to origin 73 
Transpositions in permutations 7 
Triangle rule 111 


Unitary 
operator 182 
transformations 105, 174 


Vandermonde’s determinant 21, 43 
Vectors 
condition for linear independence 
43 
linearly independent 43 
mutually orthogonal 48 
in n-dimensional space 41 


